Preface


Advanced level mathematics syllabuses are once again undergoing changes of 
content and approach, following the revolution in the early 1960s which led 
to the unfortunate dichotomy between ‘modern’ and ‘traditional’ mathema- 
tics. The current trend in syllabuses for Advanced level mathematics now 
being developed and published by many GCE Boards is towards an inte- 
grated approach, taking the best of the topics and approaches of the modern 
and traditional, in an attempt to create a realistic examination target, through 
syllabuses which are maximal for examining and minimal for teaching. In 
addition, resulting from a number of initiatives, core syllabuses are being 
developed for Advanced level mathematics syllabuses, consisting of tech- 
niques of pure mathematics as taught in schools and colleges at this level. 

The concept of a core can be used in several ways, one of which is 
mentioned above, namely the idea of a core syllabus to which options such as 
theoretical mechanics, further pure mathematics and statistics can be added. 
The books in this series are core books involving a different use of the core 
idea. They are books on a range of topics, each of which is central to the 
study of Advanced level mathematics; they form small core studies of their 
own, of topics which together cover the main areas of any single-subject 
mathematics syllabus at Advanced level. 

Particularly at times when economic conditions make the problems of 
acquiring comprehensive textbooks giving complete syllabus coverage acute, 
schools and colleges and individual students can collect as many of the core 
books as they need, one or more, to supplement books already acquired, so 
that the most recent syllabuses of, for example, the London, Cambridge, JMB 
and AEB GCE Boards, can be covered at minimum expense. Alternatively, 
of course, the whole set of core books gives complete syllabus coverage of 
single-subject Advanced level mathematics syllabuses. 

The aim of each book is to develop a major topic of the single-subject 
syllabuses, giving essential book work and worked examples and exercises 
arising from the authors’ vast experience of examining at this level, and also 
including actual past GCE questions. Thus, the core books, as well as being 
suitable for use in either of the above ways, are ideal for supplementing 
comprehensive textbooks in the sense of providing more examples and 
exercises so necessary for preparation and revision for examinations on the 
Advanced level mathematics syllabuses offered by the GCE Boards. 

An attempt has been made to give a readable yet mathematically accurate 
explanation of the concepts involved in the work on basic probability, many
of which are first introduced by means of an example. A wide selection of 
examples for the reader to work is included at the end of each chapter, and, 
after each major topic, both worked examples and graded examples on that 
topic for the reader are provided. 

A knowledge of elementary calculus is required for Chapters 2 and 4, 
which contain work on continuous variables, and for certain portions of 
Chapters 2 and 3 the ability to sum simple finite and infinite series is 
necessary. 


Peggy Sabine 
Charles Plumpton 
1 The laws of probability


1.1 Introduction 

Let us start by considering two questions: 

(i) I have drawn a card from a well-shuffled pack. Is it a spade? 

(ii) What is the chance that when I draw a card from a well-shuffled pack, it 
will be a spade? 

The first of these questions cannot be answered with certainty without picking 
up the drawn card and looking at it, and we can then only give the answer 
‘Yes’ or ‘No’, whichever is true. To the second question, using ideas of pro- 
bability, we can give at any time a precise numerical answer built on mathe- 
matical foundations. Thus probability deals with problems of the type ‘What 
is the chance that’ some event happens, and the answer, in general, will be a 
numerical quantity (the probability). A scale of measurement is necessary, 
and the scale we use allows probabilities to be measured from 0 (impossibil- 
ity) to 1 (certainty). An example of the former would be the probability that 
you will live for ever, and of the latter the probability that the sun will rise and 
set tomorrow. 

In practice, events which are impossible or which are certain to happen are
fairly rare, and most probabilities will lie strictly between the values 0 and 1.
However, some, such as the probability of winning the football pools, may be so
small as to be approximately zero (but not actually equal to 0), and others may
be very close to 1.

To return to our initial question (ii), the pack is well shuffled so that each 
card has an equal chance of being picked. There are 52 cards in the pack, of 
which 13 are spades; so there is a chance of 13 out of 52 that the card picked 
will be a spade. That is, there is a chance of 1 in 4, or a probability of $\frac{1}{4}$, that
the card will be a spade. What, then, is the probability that the card will not
be a spade? There are 39 cards which are not spades, and so the probability is
$\frac{39}{52}$, or $\frac{3}{4}$. Denoting ‘probability that the card is a spade’ by $P(S)$ and ‘probability
that the card is not a spade’ by $P(S')$, we see that

$$P(S) + P(S') = 1 \quad \text{(certainty)},$$

an obvious result, since it is certain that either a spade or not a spade will be 
picked. The event S' is called the complement of the event S . 

We call the act of picking a card an experiment ; and the results of such 
picking, the possible outcomes or simple events. The set of all possible outcomes
of an experiment is the sample (or outcome) space; this may be either
discrete , when the simple events can be arranged as a sequence, or continu- 
ous , when the simple events are recorded on a continuous scale. For example, 
the experiment of counting the number of broken eggs in a case of 100 eggs 
will have the discrete sample space {0, 1, 2, . . . , 100 }; for the experiment of 
measuring the height of the surface of a river above or below a given mark, 
the sample space is continuous and is a portion of the real line. For the pre- 
sent, we restrict ourselves to problems in which the sample space is discrete. 
To each sample point of a finite sample space we have a corresponding probability,
and these probabilities define a probability function. When a fair die
(die is the singular of dice) is thrown, each of the outcomes 1, 2, 3, 4, 5, 6 has
a probability of $\frac{1}{6}$ and the probability function is

$$P(r) = \tfrac{1}{6}, \quad r = 1, 2, \ldots, 6.$$

We can now generalise the result $P(S) + P(S') = 1$ which we found for the
card example. Given that $E$ is an event and denoting by $E'$ the event that $E$
does not happen, then

$$P(E) + P(E') = 1.$$

From the card example we can formulate a method of calculating simple
probabilities. Given that $S$ is the set of outcomes (all equally likely to occur)
of an experiment, and given that an event $E$ is satisfied by a subset $E$ of
$S$, then the probability of event $E$, $P(E)$, is given by

$$P(E) = \frac{n(E)}{n(S)} = \frac{\text{number of ‘successful’ outcomes}}{\text{total number of possible outcomes}}.$$

For a very simple example, as we saw, we can just write down $n(E)/n(S)$
($\frac{13}{52}$ in our case), but in more complicated problems we may decide to find
$n(E)$ and $n(S)$ by using permutations or combinations or both.

Sometimes, instead of stating the probability of an event happening, we
state the odds of it happening. When we say that the odds are 1 to 2 that an
event $E$ will happen, we mean that $P(E) = \frac{1}{3}$. When we say that the odds
against $E$ happening are $n$ to 1, we mean that $P(E') = n/(n + 1)$ and
$P(E) = 1/(n + 1)$.
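
The conversion from odds to probabilities can be sketched in a few lines of Python. This is only an illustration: the function names are our own, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction


def prob_from_odds_on(a, b):
    """Probability of an event when the odds are 'a to b' that it happens."""
    return Fraction(a, a + b)


def prob_from_odds_against(n, m=1):
    """Probability of an event when the odds against it are 'n to m'."""
    return Fraction(m, n + m)


print(prob_from_odds_on(1, 2))     # odds of 1 to 2 that E happens: 1/3
print(prob_from_odds_against(5))   # odds of 5 to 1 against E: 1/6
```

The two functions always give complementary answers for the same pair of numbers, in line with $P(E) + P(E') = 1$.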


Example 1 From a group of 3 boys and 2 girls we wish to select 3 at random 
(that is, each of the 5 has an equal chance of being chosen) to organise a 
school sports day. Find the probability that there will be 1 and only 1 girl 
among the 3. 

Total number of ways of choosing 3 out of 5 = $\binom{5}{3}$ = 10.

Number of ways of choosing 1 girl = 2.

Number of ways of choosing 2 boys = $\binom{3}{2}$ = 3.

$\Rightarrow$ Number of ways of choosing 1 girl and 2 boys = 2 × 3 = 6.

$\Rightarrow$ Probability of 1 and only 1 girl in the chosen 3 = $\frac{6}{10} = \frac{3}{5}$.
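
A counting argument of this kind can be checked by listing every equally likely selection. The sketch below does this for 3 boys and 2 girls; the labels B1, B2, B3, G1, G2 are hypothetical names introduced only for the illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

pupils = ["B1", "B2", "B3", "G1", "G2"]   # hypothetical labels: 3 boys, 2 girls

# Every possible group of 3, each equally likely.
committees = list(combinations(pupils, 3))

# Keep the groups which contain exactly one girl.
one_girl = [c for c in committees
            if sum(name.startswith("G") for name in c) == 1]

p_one_girl = Fraction(len(one_girl), len(committees))
print(len(committees), len(one_girl), p_one_girl)   # 10 6 3/5

# The same counts from the combination formulas.
assert len(committees) == comb(5, 3)
assert len(one_girl) == comb(2, 1) * comb(3, 2)
```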


We have mentioned already, both in the card problem and in the last 
example, the assumption of ‘equally likely events’. This assumption must, of 
course, be justified when we are calculating probabilities, and if we take the 
probability of getting any particular number from 1 to 6 (inclusive) when we throw a die
to be $\frac{1}{6}$, then we are assuming that the die is fair or unbiased and that each of
the numbers 1 to 6 has an equal chance of occurring. Similarly, for a fair coin
the probability of getting a ‘head’ in a toss is $\frac{1}{2}$, since there are 2 possible
outcomes (each equally likely, if the coin is a fair one) of which 1 is a ‘head’. 
Unless stated to the contrary, in this book all coins and dice are taken to be 
fair (unbiased), and packs of cards are normal packs of 52 cards. 

Most problems in probability are concerned with the happening not of one 
event only but of two or more. Two events $E$ and $F$ are said to be mutually
exclusive if they cannot occur together; that is, if $P(E \cap F) = 0$. For
example, a ‘head’ and a ‘tail’ are mutually exclusive events when a coin is 
tossed. 

In the card problem, suppose we wish to find the probability of getting a
spade ($S$) or a heart ($H$).

$$P(S \cup H) = \frac{26}{52} = \frac{1}{2},$$

since 26 cards are either spades or hearts, and

$$P(S \cup H \cup D) = \frac{39}{52} = \frac{3}{4},$$

since 39 cards are spades or hearts or diamonds ($D$). We see, then, that for
the mutually exclusive events ‘spade’, ‘heart’, ‘diamond’, we have

$$P(S \cup H) = P(S) + P(H),$$

$$P(S \cup H \cup D) = P(S) + P(H) + P(D).$$

This suggests an addition law for probabilities. The general form of the law
for two events $E$ and $F$ is

$$P(E \cup F) = P(E) + P(F) - P(E \cap F).$$

This can best be illustrated by using a Venn diagram (Fig. 1.1). $E$ and $F$ are
subsets of $S$, the outcome space. We require

$$P(E \cup F) = \frac{n(E \cup F)}{n(S)} = \frac{n(E) + n(F) - n(E \cap F)}{n(S)},$$

since the subset $(E \cap F)$ shaded is included twice in $n(E) + n(F)$, and, hence,

$$P(E \cup F) = P(E) + P(F) - P(E \cap F).$$

Now if $E$ and $F$ are mutually exclusive events, then $P(E \cap F) = 0$, and,
hence, the law becomes

$$P(E \cup F) = P(E) + P(F),$$

as was illustrated by our card example.

The events $E_1, E_2, \ldots, E_n$ are said to be exhaustive if they cover all
possible outcomes of an experiment; that is, the $n$ events form the outcome
space. In general, if $E_1, E_2, \ldots, E_n$ are mutually exclusive events and they
are also exhaustive, then we can extend the addition law to obtain

$$P(E_1 \cup E_2 \cup \ldots \cup E_n) = P(E_1) + P(E_2) + \ldots + P(E_n) = 1.$$


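Example 2 A boy has 12 coins in his pocket: three 10p pieces, three 2p,
one 1p, five 5p. He draws a coin at random from his pocket. Find the
probability that it is

(a) either a 2p or a 10p coin,

(b) a 2p, a 5p or a 10p coin.

Here the events ‘2p’, ‘5p’, ‘10p’, ‘1p’ are mutually exclusive.

$$P(10\text{p}) = \frac{3}{12} = \frac{1}{4}, \quad P(2\text{p}) = \frac{3}{12} = \frac{1}{4}, \quad P(1\text{p}) = \frac{1}{12}, \quad P(5\text{p}) = \frac{5}{12}.$$

(Note here that the sum of all these probabilities must, and does, equal 1,
certainty.)

(a) $P(2\text{p} \cup 10\text{p}) = P(2\text{p}) + P(10\text{p}) = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}$.

(b) $P(2\text{p} \cup 5\text{p} \cup 10\text{p}) = P(2\text{p}) + P(5\text{p}) + P(10\text{p}) = \frac{1}{4} + \frac{5}{12} + \frac{1}{4} = \frac{11}{12}$.

Or, better, $P(2\text{p} \cup 5\text{p} \cup 10\text{p}) = 1 - P(1\text{p}) = 1 - \frac{1}{12} = \frac{11}{12}$.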





Example 3 From a well-shuffled pack of cards one is picked at random. Find 
the probability that it is either a spade or a king. 

Here the events ‘spade’ (S) and ‘king’ are not mutually exclusive, since they 
can occur together as the king of spades. 

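$$P(S) = \frac{13}{52} = \frac{1}{4}, \quad P(\text{king}) = \frac{4}{52} = \frac{1}{13}, \quad P(S \cap \text{king}) = \frac{1}{52}.$$

Hence, $P(S \cup \text{king}) = \frac{1}{4} + \frac{1}{13} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13}$.

(Alternatively, we could say that the total number of cards that are either
spade or king is 13 spades plus 3 other kings; that is, 16 cards in all. Hence,
$P(S \cup \text{king}) = \frac{16}{52} = \frac{4}{13}$, as before.)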
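
The addition law for events which are not mutually exclusive can be checked by treating events as sets of cards. The sketch below builds a normal pack of 52 cards (the rank and suit labels are our own choice) and compares both sides of the law for ‘spade’ and ‘king’.

```python
from fractions import Fraction
from itertools import product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
pack = set(product(ranks, suits))          # 52 equally likely outcomes

spades = {card for card in pack if card[1] == "spades"}
kings = {card for card in pack if card[0] == "K"}


def prob(event):
    """P(event) = n(event) / n(S) for equally likely outcomes."""
    return Fraction(len(event), len(pack))


lhs = prob(spades | kings)                               # P(S or king)
rhs = prob(spades) + prob(kings) - prob(spades & kings)  # addition law
print(lhs, rhs)   # 4/13 4/13
```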

Example 4 Three items are chosen at random from a lot containing 16 items, 
of which 4 are defective. Find the probability that 

(a) all three items are defective, 

(b) all three items are non-defective, 

(c) at least one item is defective. 


The number of ways of choosing 3 items from 16 items is

$$\binom{16}{3} = \frac{16 \times 15 \times 14}{1 \times 2 \times 3} = 560 \text{ ways}.$$

(a) The number of ways of choosing 3 defective items from 4 defective
items is

$$\binom{4}{3} = 4 \text{ ways}.$$

Probability that all 3 items are defective = $\frac{4}{560} = \frac{1}{140}$.

(b) The number of ways of choosing 3 non-defective items from 12
non-defective items is

$$\binom{12}{3} = \frac{12 \times 11 \times 10}{1 \times 2 \times 3} = 220 \text{ ways}.$$

Probability that all 3 items are non-defective = $\frac{220}{560} = \frac{11}{28}$.

(c) The event that at least 1 item is defective is the complement of the event
that all 3 items are non-defective and, hence,

$$P(\text{all 3 items non-defective}) + P(\text{at least 1 item defective}) = 1$$

$$\Rightarrow P(\text{at least 1 item defective}) = 1 - \frac{11}{28} = \frac{17}{28}.$$
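
The three answers of Example 4 can be reproduced with the standard-library function `math.comb`, which returns the number of combinations. The variable names below are our own.

```python
from fractions import Fraction
from math import comb

total = comb(16, 3)                           # ways of choosing 3 items from 16

p_all_defective = Fraction(comb(4, 3), total)
p_none_defective = Fraction(comb(12, 3), total)
p_at_least_one = 1 - p_none_defective         # complement of 'none defective'

print(total, p_all_defective, p_none_defective, p_at_least_one)
# 560 1/140 11/28 17/28
```

Working with the complement in part (c) avoids adding the separate cases of exactly 1, 2 and 3 defective items.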
Exercise 1.1

1 A die is weighted so that, when the die is thrown, a ‘6’ is twice as likely to occur as 
each of the other numbers. Find the probability of a ‘6’, and of each of the other 
numbers, occurring. 

2 For the die of question 1, find the probability of obtaining on a single throw 

(a) an even number, 

(b) an odd number, 

(c) a prime number, 

(d) an odd prime number, 

(e) an even number which is not a prime number. [Treat ‘1’ as prime.]

3 The events E and F are such that 

P(£') = i P(F n F) = P(£ U F) = l 

Find 

(a) $P(E)$, (b) $P(F)$, (c) $P(E \cap F)$.

4 A committee of 3 is chosen at random from a group of 20 people consisting of 12 
men and 8 women. Find the probability that 

(a) 3 men are chosen, 

(b) at least 1 man is chosen. 

1.2 The multiplication law 

We said that, for two mutually exclusive events $E$ and $F$, $P(E \cap F) = 0$
by definition. The multiplication law enables us to find an expression for
$P(E \cap F)$ when $E$ and $F$ are not mutually exclusive. Consider first the case
where $E$ and $F$ are independent events; that is, where the knowledge of the
occurrence of one event does not affect the probability of the other event
occurring. Under this condition the probability that both $E$ and $F$ occur
simultaneously is equal to the product of the probabilities that $E$ and $F$ occur
separately; that is,

$$P(E \cap F) = P(E).P(F).$$

(This can be extended to more than two independent events:
$P(E \cap F \cap G) = P(E).P(F).P(G)$, etc.)

For such independent events we could write

$$P(E, \text{ given that } F \text{ occurred}) = P(E)$$

and

$$P(F, \text{ given that } E \text{ occurred}) = P(F).$$

We use the symbol | to shorten this and write

$$P(E \mid F) = P(E), \quad P(F \mid E) = P(F),$$

where $E$ and $F$ are independent events. $P(E \mid F)$ is called the conditional
probability of $E$ on $F$, and is to be read as the probability of $E$ given that $F$ has
occurred. We can illustrate this by means of another Venn diagram (Fig. 1.2).
We will denote the sample space by $S$, and represent the set of equally likely
outcomes of event E and the set of equally likely outcomes of event F within S 
as shown. Then P(E|F) is the probability of E occurring given that F has 
occurred and, hence, the sample space is reduced to the set F. 



Fig. 1.2 


Therefore

$$P(E \mid F) = \frac{n(E \cap F)}{n(F)} = \frac{n(E \cap F)/n(S)}{n(F)/n(S)} = \frac{P(E \cap F)}{P(F)}$$

or

$$P(E \cap F) = P(E \mid F).P(F) = P(F \mid E).P(E),$$

by symmetry. This is the multiplication rule. If $E$ and $F$ are independent
events, we know that $P(E \mid F) = P(E)$ and the rule reduces, for independent
events, to $P(E \cap F) = P(E).P(F)$, as stated earlier.

Example 5 A fair coin is tossed twice. Find the probability of obtaining 
exactly one ‘tail’. 

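$P(H) = \frac{1}{2}$, $P(T) = \frac{1}{2}$. Events are independent.

$$P(HT) = P(H).P(T) = \tfrac{1}{2} \times \tfrac{1}{2} = \tfrac{1}{4},$$

$$P(TH) = P(T).P(H) = \tfrac{1}{4},$$

$$P(\text{exactly one ‘tail’}) = \tfrac{1}{4} + \tfrac{1}{4} = \tfrac{1}{2}.$$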

Example 6 Data on heights of 930 mothers of only sons and their adult sons 
are given below. Heights for short and tall are divided at 1.57 m for mothers
and 1.75 m for sons.
                        Son
                 Short   Tall   Total
Mother  Short     320     104     424
        Tall      180     326     506
        Total     500     430    (930)


Find the probability that 

(a) a tall mother ( TM ) has a tall son ( TS ), 

(b) a short mother (SM) has a tall son, 

(c) a tall son has a short mother. 


Here the events tall or short, mother or son, are not independent. 


(a) $P(TS \mid TM) = \dfrac{P(TS \cap TM)}{P(TM)} = \dfrac{326/930}{506/930} = 0.644$.

(b) $P(TS \mid SM) = \dfrac{P(TS \cap SM)}{P(SM)} = \dfrac{104/930}{424/930} = 0.245$.

(c) $P(SM \mid TS) = \dfrac{P(SM \cap TS)}{P(TS)} = \dfrac{104/930}{430/930} = 0.242$.

It is important to notice that $P(TS \mid SM)$ is not the same as $P(SM \mid TS)$. The
former is the probability that, given a short mother, the son is tall. The
latter is the probability that, given a tall son, the mother is short.
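
Conditional probabilities taken from a two-way table can be computed directly from the cell counts. The sketch below assumes that the rows of the table in Example 6 refer to mothers and the columns to sons; the function and variable names are our own.

```python
from fractions import Fraction

# Cell counts: (mother, son) -> number of mother-and-son pairs.
counts = {
    ("short", "short"): 320, ("short", "tall"): 104,
    ("tall", "short"): 180, ("tall", "tall"): 326,
}
total = sum(counts.values())   # 930


def prob(event):
    """P(event), where event is a test applied to a (mother, son) cell."""
    return Fraction(sum(n for cell, n in counts.items() if event(cell)), total)


def conditional(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda cell: event(cell) and given(cell)) / prob(given)


def tall_mother(cell):
    return cell[0] == "tall"


def short_mother(cell):
    return cell[0] == "short"


def tall_son(cell):
    return cell[1] == "tall"


p_a = conditional(tall_son, tall_mother)    # P(TS | TM)
p_b = conditional(tall_son, short_mother)   # P(TS | SM)
p_c = conditional(short_mother, tall_son)   # P(SM | TS)
print(round(float(p_a), 3), round(float(p_b), 3), round(float(p_c), 3))
# 0.644 0.245 0.242
```

Swapping the two arguments of `conditional` changes the answer, which is the point made about $P(TS \mid SM)$ and $P(SM \mid TS)$.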


Example 7 Find the probability of drawing 2 spades when drawing 2 cards 
from a well-shuffled pack. 

Here we must know how the 2 cards are to be drawn; the question alone
does not give us sufficient information for us to find an answer. We must
know whether the cards are drawn with replacement or without replacement.

(a) If we draw a card, then replace it in the pack (with replacement), and then
draw again, we have

$$P(S_1 \text{ and } S_2) = P(S_1).P(S_2),$$

since the two events of drawing a spade are independent. That is,

$$P(S_1 \text{ and } S_2) = \frac{13}{52} \times \frac{13}{52} = \frac{1}{16}.$$

(b) If we draw a card, do not replace it in the pack, and then draw again
(without replacement), the two events of obtaining a spade are now dependent,
for the second draw probability depends on whether or not we have
drawn a spade on the first draw.

$$P(\text{spade on first draw}) = P(S_1) = \frac{13}{52} = \frac{1}{4}.$$

The pack is now down to 51 cards. We need

$$P(S_1 \text{ and } S_2) = P(S_1).P(S_2 \mid S_1).$$

Given that a spade was drawn the first time, there are now 12 spades in the 51
cards and we have

$$P(S_1 \text{ and } S_2) = \frac{1}{4} \times \frac{12}{51} = \frac{1}{17}.$$

The same method and result would apply if the 2 cards were drawn simultaneously
(as a pair) from the pack.

As an alternative method we could use

$$P(S_1 \text{ and } S_2) = \binom{13}{2} \bigg/ \binom{52}{2} = \frac{1}{17}.$$
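
The difference between drawing with and without replacement can be seen by listing every ordered pair of draws. In this sketch the cards are simply numbered 0 to 51 and the first 13 are taken to be the spades; that labelling is an assumption made for the illustration.

```python
from fractions import Fraction
from itertools import permutations, product

pack = ["spade"] * 13 + ["other"] * 39   # card i is a spade when i < 13


def both_spades(pair):
    return all(pack[i] == "spade" for i in pair)


# With replacement: the same card may appear on both draws.
with_replacement = list(product(range(52), repeat=2))
# Without replacement: the two draws are different cards.
without_replacement = list(permutations(range(52), 2))

p_with = Fraction(sum(map(both_spades, with_replacement)),
                  len(with_replacement))
p_without = Fraction(sum(map(both_spades, without_replacement)),
                     len(without_replacement))
print(p_with, p_without)   # 1/16 1/17
```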


Example 8 In a certain strain of wallflower, the probability that a seed produces
a plant with yellow flowers is $\frac{1}{4}$. Find the number of seeds that should
be sown in order that the probability of obtaining at least one plant with
yellow flowers will be greater than 0.99.

Let the number of seeds be $n$. Probability of yellow flowers is $\frac{1}{4}$. Probability
of not yellow flowers is $\frac{3}{4}$.

$$P(\text{at least one plant with yellow flowers}) = 1 - P(\text{none with yellow flowers}) = 1 - \left(\tfrac{3}{4}\right)^n > 0.99$$

$$\Rightarrow 0.01 > \left(\tfrac{3}{4}\right)^n$$

$$\Rightarrow \left(\tfrac{4}{3}\right)^n > 100$$

$$\Rightarrow n(\lg 4 - \lg 3) > \lg 100 = 2$$

$$\Rightarrow n > \frac{2}{\lg 4 - \lg 3} = 16.01,$$

i.e. 17 seeds must be sown.
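
The answer to Example 8 can be confirmed by direct search, without logarithms. The sketch assumes that the probability of yellow flowers is $\frac{1}{4}$, the value implied by the working with $\lg 4 - \lg 3$; the function name is our own.

```python
from fractions import Fraction
from math import log10


def min_seeds(p_yellow, target):
    """Smallest n for which P(at least one yellow) = 1 - (1 - p)^n exceeds target."""
    p_none = 1 - p_yellow
    n, q = 1, p_none              # q holds (1 - p)^n
    while 1 - q <= target:
        n += 1
        q *= p_none
    return n


print(min_seeds(Fraction(1, 4), Fraction(99, 100)))   # 17

# The bound obtained by taking logarithms to base 10.
print(round(2 / (log10(4) - log10(3)), 2))            # 16.01
```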


Exercise 1.2 

1 A box contains 4 red buttons and 4 white buttons. Find the probability that, when 
2 buttons are chosen at random and without replacement, 1 will be red and 1 will 
be white. 

2 A committee of 3 is chosen at random from a group of 20 people consisting of 12 
men and 8 women. Find the probability that 

(a) exactly 2 men are chosen, 

(b) exactly 2 women are chosen. 

3 For the two events E and F, P(F) = P(F) = \ and P (E Pi F) = 5. Find 
(a) $P(E \mid F)$, (b) $P(F \mid E)$, (c) $P(E \cup F)$, (d) $P(E' \mid F')$.

4 The independent probabilities that John and Bill hit a target at rifle practice are 5 
and respectively. 
(a) Given that each of them fires twice, find the probability that the target will
be hit at least once.

(b) Given that John can fire only once, find the least number of times that Bill
must fire so that there is a probability of at least 0.95 that the target will be hit.

(c) Given that John and Bill each fire once, and the target is hit only once, find 
the probability that it will be John who hits the target. 

1.3 Tree diagrams and Bayes' theorem 

The solution of some probability questions can be made easier by the use of a 
tree diagram showing all the possible outcomes. For this, we represent by the 
‘branches’ of a ‘tree’ all the possible outcomes of the first event, attaching to 
each branch the probability of that particular outcome occurring. Then, from 
the ends of each of these branches, we draw branches representing all the 
possible outcomes of the second event, with their corresponding probabilities 
of occurrence attached. In the same way we continue for the total number of 
events with which we are concerned. If we then follow one (or more) 
particular ‘branch line(s)’, we can find the probability of a particular com- 
bination of outcomes in the events. 

For example, suppose we have two boxes $A$ and $B$. Box $A$ contains 3 red
($R$) and 4 white ($W$) balls. Box $B$ contains 5 red and 3 white balls. We choose
a box at random and then choose a ball at random from that box. The probability
tree diagram is shown in Fig. 1.3.

In box $A$, $P(R) = \frac{3}{7}$, $P(W) = \frac{4}{7}$.

In box $B$, $P(R) = \frac{5}{8}$, $P(W) = \frac{3}{8}$.

We then follow along any route from $O$ to the tip of a branch and we obtain
the probability by multiplying together the probabilities on the branches.
Hence, for example,

$$P(A \cap R) = \frac{1}{2} \times \frac{3}{7} = \frac{3}{14}.$$
It can be seen that, if we have a total of $k$ events, and the numbers of possible
outcomes are, respectively, $n_1, n_2, \ldots, n_k$ for the 1st, 2nd, . . . , $k$th events,
then the total number of possible outcomes for the $k$ events is
$n_1 \times n_2 \times n_3 \times \cdots \times n_k$. Hence, the number of branches in a tree diagram
increases very rapidly as the number of events increases.

Example 9 When a person needs a minicab, he hires one from one of three 
firms A, B, or C. Of his hirings, 60% are from firm A, 30% from B and 10% 
from C. From firm A, 9% of the cabs arrive late; from B, 20% arrive late; 
and from C, 6% arrive late. Find 

(a) the probability that a cab chosen at random from those he hires will be 
from firm C and will not be late, 

(b) the probability that a cab hired by the person will be late. 

Fig. 1.4 (tree diagram: each of the firms A, B and C branches into ‘Late’ and ‘Not late’)



Using the tree diagram in Fig. 1.4, we have

(a) the crossed branch line

$$P(C \cap \text{not late}) = 0.1 \times 0.94 = 0.094,$$

(b) the sum of the 3 dotted ‘branch lines’

$$\Rightarrow P(\text{late}) = 0.6 \times 0.09 + 0.3 \times 0.2 + 0.1 \times 0.06 = 0.12.$$
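
The rule ‘multiply along a branch line, add across branch lines’ can be written out as a short calculation. The sketch below stores, for each firm of Example 9, the probability of hiring from it and the probability that its cab is late; the dictionary layout is our own choice.

```python
from fractions import Fraction

# firm -> (P(firm), P(late | firm)), taken from the statement of the example
firms = {
    "A": (Fraction(60, 100), Fraction(9, 100)),
    "B": (Fraction(30, 100), Fraction(20, 100)),
    "C": (Fraction(10, 100), Fraction(6, 100)),
}

# One branch line: multiply the probabilities along it.
p_c_not_late = firms["C"][0] * (1 - firms["C"][1])

# Several branch lines: add the products, one 'late' line per firm.
p_late = sum(p_firm * p_late_given for p_firm, p_late_given in firms.values())

print(float(p_c_not_late), float(p_late))   # 0.094 0.12
```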

Let us look now at a problem where the words seem a hopeless tangle but a 
tree diagram (Fig. 1.5) makes the solution quite simple. 
Example 10 A headmaster knows that for the science sixth form at his school,
the odds are 1 to 2 that a pupil will take chemistry (C). If a pupil does take 
chemistry, the odds are 3 to 1 that he or she will take physics (P), and if a 
pupil does not take chemistry, the odds are 7 to 1 that he or she will take 
physics. If a pupil takes both chemistry and physics, the odds are 5 to 1 that he 
or she also takes mathematics (M); if a pupil takes chemistry but not physics, 
the odds are 1 to 4 that he or she also takes mathematics. For a pupil who 
takes neither chemistry nor physics, the odds are 3 to 1 that he or she takes 
mathematics; for a pupil taking physics but not chemistry, the odds are 9 to 1 
that he or she takes mathematics. Find the probability that a pupil chosen at 
random 

(a) from this science sixth form takes mathematics, 

(b) from this science sixth form does not take any of the subjects mathe- 
matics, chemistry or physics, 

(c) from those who take mathematics, also takes chemistry, 

(d) from those who take both mathematics and physics, also takes chemistry. 
Fig. 1.5 shows the tree diagram for this problem. 
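(a) $P(M) = \frac{1}{3}\left(\frac{3}{4} \times \frac{5}{6} + \frac{1}{4} \times \frac{1}{5}\right) + \frac{2}{3}\left(\frac{7}{8} \times \frac{9}{10} + \frac{1}{8} \times \frac{3}{4}\right) = \frac{13}{16}$.

(b) $P(\text{none of the three subjects}) = P(C' \cap P' \cap M') = \frac{2}{3} \times \frac{1}{8} \times \frac{1}{4} = \frac{1}{48}$.

(c) $P(C \mid M) = \dfrac{P(C \cap M)}{P(M)} = \dfrac{\frac{1}{3}\left(\frac{3}{4} \times \frac{5}{6} + \frac{1}{4} \times \frac{1}{5}\right)}{\frac{13}{16}} = \dfrac{18}{65}$.

(d) $P(C \mid P \cap M) = \dfrac{P(C \cap P \cap M)}{P(P \cap M)} = \dfrac{\frac{1}{3} \times \frac{3}{4} \times \frac{5}{6}}{\frac{1}{3} \times \frac{3}{4} \times \frac{5}{6} + \frac{2}{3} \times \frac{7}{8} \times \frac{9}{10}} = \dfrac{25}{88}$.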

Example 11 Let us look again at Example 9, and ask the question:
‘If a minicab is called and it arrives late, what is the probability that it came
from firm $B$?’

Now

$$P(B \mid \text{late}) = \frac{P(B \cap \text{late})}{P(\text{late})} = \frac{0.3 \times 0.2}{0.12} = 0.5.$$

[We asked, and answered, the same type of question in Example 10(c) and (d).]

How is the solution built up?

$$P(B \mid \text{late}) = \frac{P(B \cap \text{late})}{P(\text{late})} = \frac{P(B).P(\text{late} \mid B)}{P(\text{late})} = \frac{P(B).P(\text{late} \mid B)}{P(A).P(\text{late} \mid A) + P(B).P(\text{late} \mid B) + P(C).P(\text{late} \mid C)}$$

(= 0.5 in our problem).


We can now state Bayes’ theorem, which expresses this result in symbolic
form. Given that $B_1, B_2, \ldots, B_n$ are a mutually exclusive and exhaustive set
of outcomes of a random process, and $E$ is a chance event (where $P(E) \neq 0$)
caused by, or preceded by, one of the events $B_1, B_2, \ldots, B_n$, then

$$P(B_k \mid E) = \frac{P(B_k).P(E \mid B_k)}{\sum_{r=1}^{n} P(B_r).P(E \mid B_r)}$$

for $k = 1, 2, \ldots, n$.

Proof (of Bayes’ theorem) From the definition of conditional probability, we
have

$$P(B_k \mid E) = \frac{P(B_k \cap E)}{P(E)} = \frac{P(B_k).P(E \mid B_k)}{P(E)}.$$

Since event $E$ is caused by, or preceded by, one of the events $B_k$, $k = 1, 2, \ldots, n$, then

$$P(E) = P(E \cap B_1) + P(E \cap B_2) + \cdots + P(E \cap B_n) = \sum_{r=1}^{n} P(E \cap B_r) = \sum_{r=1}^{n} P(B_r).P(E \mid B_r)$$

$$\Rightarrow P(B_k \mid E) = \frac{P(B_k).P(E \mid B_k)}{\sum_{r=1}^{n} P(B_r).P(E \mid B_r)}.$$

Example 12 In a large company, 15% of the employees are graduates (G), 
and, of these, 80% work in administrative posts (A). Of the non-graduate 
( NG ) employees of the company, 10% work in administrative posts. Find the 
probability that an employee of this company selected at random from those 
working in administrative posts will be a graduate. 


We have 


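$$P(G) = 0.15, \quad P(NG) = 0.85, \quad P(A \mid NG) = 0.10, \quad P(A \mid G) = 0.80,$$

$$P(G).P(A \mid G) = 0.15 \times 0.80, \quad \text{and} \quad P(NG).P(A \mid NG) = 0.85 \times 0.10.$$

Bayes’ theorem gives

$$P(G \mid A) = \frac{P(G).P(A \mid G)}{P(G).P(A \mid G) + P(NG).P(A \mid NG)} = \frac{0.15 \times 0.80}{0.15 \times 0.80 + 0.85 \times 0.10} = 0.585.$$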
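
Bayes’ theorem translates almost word for word into a short function: multiply each prior probability by the corresponding conditional probability, then divide by the sum of these products. The sketch below applies it to the figures of Example 12; the function name `bayes` and the dictionary keys are our own.

```python
from fractions import Fraction


def bayes(priors, likelihoods):
    """Return P(B_k | E) for every k, given P(B_k) and P(E | B_k)."""
    joint = {k: priors[k] * likelihoods[k] for k in priors}   # P(B_k).P(E | B_k)
    p_event = sum(joint.values())                             # P(E)
    return {k: value / p_event for k, value in joint.items()}


priors = {"G": Fraction(15, 100), "NG": Fraction(85, 100)}
admin_given = {"G": Fraction(80, 100), "NG": Fraction(10, 100)}

posterior = bayes(priors, admin_given)
print(posterior["G"], round(float(posterior["G"]), 3))   # 24/41 0.585
```

Because the events $B_k$ are mutually exclusive and exhaustive, the values returned always add up to 1.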


Exercise 1.3 

1 We have two boxes A and B, where A contains 2 red and 3 white balls, B contains 
5 red and 4 white balls. We toss a coin, and, if we get a head, we take a ball at 
random from box A, but if we get a tail we take a ball at random from box B. We 
toss the coin once. Use a tree diagram to find the probability that we will take a 
red ball. 

2 We have two boxes A and B, where A contains 5 tickets numbered from 1 to 5 
and B contains 7 tickets numbered from 1 to 7. We choose one of the two boxes at 
random and then pick a ticket from it at random. Use a tree diagram or Bayes’ 
theorem to find the probability that the ticket comes from box B, given that its 
number is odd. 

3 Steel girders are manufactured by three factories A, B and C. Each month, 
factory A makes twice as many as factory B, and factories B and C make the same 
number of girders. Of the girders made by factory A and of those made by B, 2% 
are defective; of those made by C, 4% are defective. One month’s production of 
girders of the three factories is put into a warehouse. Given that one girder is 
chosen at random from the warehouse, 

(a) find the probability that this item will be defective, 

(b) given that the girder is defective, find the probability that it comes from 
factory A. 
4 A box contains 6 batteries, of which 2 are known to be flat. The batteries are
tested one after the other until the 2 flat batteries are found. Find the probability 
that the 2 flat batteries will be found when just 2 batteries have been tested. Find 
also the probability that the 2 flat batteries will be found only when just 3 batteries 
have been tested. 


1.4 Markov chains

Consider the problem of a shopper who is buying margarine ($Ma$) or butter
($Bu$) (but not both). Suppose that it has been found that, when she buys margarine
on a given shopping outing, there is a probability of 0.8 that she will
buy margarine the next time she purchases, and a probability of 0.2 that she
buys butter instead. Also, if she buys butter on the given outing, the probability
that she buys butter the next time is 0.6 and the probability is 0.4 that
she buys margarine. Suppose further that, on the first purchase being
considered, there is an equal chance that she will buy butter or margarine.
What is the probability that she will buy butter on (a) the third, (b) the ninth
purchase? We draw the tree diagram, Fig. 1.6. Then






$$P(Bu \text{ on second purchase}) = P(Bu_2) = 0.5 \times 0.2 + 0.5 \times 0.6 = 0.4$$

$$\Rightarrow P(Ma_2) = 1 - 0.4 = 0.6 \quad (\text{or } 0.5 \times 0.8 + 0.5 \times 0.4).$$

Also

$$P(Bu_3) = 0.5 \times 0.8 \times 0.2 + 0.5 \times 0.2 \times 0.6 + 0.5 \times 0.4 \times 0.2 + 0.5 \times 0.6 \times 0.6 = 0.36$$

$$\Rightarrow P(Ma_3) = 0.64.$$

We could find probabilities for the fourth purchase in the same way, but the
tree diagram becomes rather large and cumbersome, and for the ninth purchase
it would be quite unmanageable. Let us look at the previous results,
using the notation $P(Bu_k)$, $P(Ma_k)$ for the probabilities of purchasing butter
and margarine, respectively, on the $k$th purchase.

$$(P(Bu_2), P(Ma_2)) = (0.5(0.6 + 0.2), \; 0.5(0.4 + 0.8)) = (0.5 \;\; 0.5)\begin{pmatrix} 0.6 & 0.4 \\ 0.2 & 0.8 \end{pmatrix} = (P(Bu_1), P(Ma_1))\begin{pmatrix} 0.6 & 0.4 \\ 0.2 & 0.8 \end{pmatrix}.$$

If we denote the probability row vectors $(P(Bu_k), P(Ma_k))$, for $k = 1, 2, \ldots$,
by $\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3, \ldots$, we have $\mathbf{p}_2 = \mathbf{p}_1\mathbf{M}$, where

$$\mathbf{M} = \begin{pmatrix} 0.6 & 0.4 \\ 0.2 & 0.8 \end{pmatrix}.$$


Here the matrix M is called the transition matrix, and it is the fixed matrix of 
probabilities which takes us from one state (purchasing occasion in our 
example) to the next state (next purchasing occasion). 

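Similarly,

$$\mathbf{p}_3 = (P(Bu_3), P(Ma_3)) = (0.5(0.6 \times 0.6 + 0.4 \times 0.2 + 0.2 \times 0.6 + 0.8 \times 0.2), \; 0.5(0.6 \times 0.4 + 0.4 \times 0.8 + 0.2 \times 0.4 + 0.8 \times 0.8))$$

$$= ([0.5(0.6 + 0.2)0.6 + 0.5(0.4 + 0.8)0.2], \; [0.5(0.6 + 0.2)0.4 + 0.5(0.4 + 0.8)0.8])$$

$$= (0.5(0.6 + 0.2), \; 0.5(0.4 + 0.8))\begin{pmatrix} 0.6 & 0.4 \\ 0.2 & 0.8 \end{pmatrix}$$

$$= \mathbf{p}_2\mathbf{M} = \mathbf{p}_1\mathbf{M}^2.$$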

It seems probable that

$$\mathbf{p}_4 = \mathbf{p}_3\mathbf{M} = \mathbf{p}_2\mathbf{M}^2 = \mathbf{p}_1\mathbf{M}^3, \quad \mathbf{p}_5 = \mathbf{p}_1\mathbf{M}^4, \text{ etc.},$$

and in fact these can all be shown by use of the tree diagram and by
multiplying the probability vectors by $\mathbf{M}$. The general result $\mathbf{p}_k = \mathbf{p}_1\mathbf{M}^{k-1}$,
where $k$ is a positive integer, may be shown by induction.

Thus, for $P(Bu_9)$, instead of drawing a very large tree diagram, we could
get our result by post-multiplying $\mathbf{p}_1$ by $\mathbf{M}^8$.
The calculations for the vectors $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_9$ give
(0.5, 0.5), (0.4, 0.6), (0.36, 0.64), (0.344, 0.656), (0.338, 0.662), (0.335, 0.665),
(0.334, 0.666), (0.3336, 0.6664), (0.33344, 0.66656), respectively, and it appears
that $\mathbf{p}_k$ may be tending to a limit vector of $(\frac{1}{3}, \frac{2}{3})$. You will notice that,
because the elements of the row vectors are probabilities, the
elements must add up to 1, certainty. If $\mathbf{p}_{k+1} = \mathbf{p}_k\mathbf{M} = \mathbf{p}_k$ for some value of $k$,
then we say that $\mathbf{p}_k$ is the equilibrium state $\mathbf{p}$, and this will be the probability
vector for all further events, since $\mathbf{p}_{k+2} = \mathbf{p}_{k+1}\mathbf{M} = \mathbf{p}_k\mathbf{M} = \mathbf{p}_k = \mathbf{p}$.

This would imply that the probability of buying butter and the probability
of buying margarine would each be constant for all events from the $k$th
onwards.

In a problem of this kind, where the probability of the $r$th event depends
only on the result of the $(r - 1)$th event, and where the transition matrix is the
same for each successive pair of events, we are dealing with a Markov chain,
and using a Markov process. The equilibrium state (or limiting vector), $\mathbf{p}$, can
be found from the equation $\mathbf{pM} = \mathbf{p}$, provided that we know $\mathbf{M}$, and it is, of
course, independent of the initial probability vector $\mathbf{p}_1$. Our example has
involved row vectors with two elements and $\mathbf{M}$, a 2 × 2 matrix, but a Markov
process is not, of course, restricted to these. It can be used for $\mathbf{M}$, an $n \times n$
matrix, and row vectors of $n$ elements, where $n$ is any positive integer.

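Example 13 For the transition matrix
$\begin{pmatrix} \tfrac15 & \tfrac45 \\ \tfrac34 & \tfrac14 \end{pmatrix}$
of a Markov process, find the limit to which the probability vector will tend.

In the equilibrium state

\[
(p, q)\begin{pmatrix} \tfrac15 & \tfrac45 \\ \tfrac34 & \tfrac14 \end{pmatrix} = (p, q),
\quad \text{where } p + q = 1,
\]

\[
\Rightarrow
\left.
\begin{aligned}
\tfrac15 p + \tfrac34 q &= p\\
\tfrac45 p + \tfrac14 q &= q\\
p + q &= 1
\end{aligned}
\right\}
\Rightarrow p = \tfrac{15}{31}, \quad q = \tfrac{16}{31}.
\]

Here we have three equations in two unknowns but the equations are
consistent; that is, the solutions for p and q satisfy all three equations. The limit
probability vector is $(\tfrac{15}{31}, \tfrac{16}{31})$.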
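
As a check on Example 13, the equilibrium equations can be solved mechanically. The sketch below assumes the transition matrix has rows (1/5, 4/5) and (3/4, 1/4), and the helper name `equilibrium_2x2` is illustrative only; it uses the fact that, for a 2 x 2 transition matrix, the first balance equation reduces to $p\,m_{12} = q\,m_{21}$.

```python
from fractions import Fraction

def equilibrium_2x2(M):
    """Solve (p, q) M = (p, q) with p + q = 1 for a 2 x 2 transition matrix."""
    b = M[0][1]  # probability of moving from state 1 to state 2
    c = M[1][0]  # probability of moving from state 2 to state 1
    # p(1 - b) + q c = p reduces to p b = q c; combine with p + q = 1.
    return (c / (b + c), b / (b + c))

# Matrix assumed for Example 13.
M = [[Fraction(1, 5), Fraction(4, 5)],
     [Fraction(3, 4), Fraction(1, 4)]]
p, q = equilibrium_2x2(M)
print(p, q)
```

Substituting the result back into all three equations confirms that they are consistent.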


Example 14 If a man leaves home too late to catch his bus to work on any
day, the probability that he is late the following day is $\tfrac13$, whereas if he leaves
in time to catch it on any day, then the probability that he is late on the
following day is $\tfrac35$. The man catches his bus on Tuesday and leaves for work
each day.

(a) Write down the transition matrix.

(b) Calculate the probability that he catches his bus on the following Friday.

(c) Show that, over a long period, the probability that he will catch his bus to
work is $\tfrac{10}{19}$.


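(a) Taking the states in the order (catches, late), the transition matrix is

\[
\mathbf{M} = \begin{pmatrix} \tfrac25 & \tfrac35 \\ \tfrac23 & \tfrac13 \end{pmatrix}.
\]

(b) Wednesday probability vector is

\[
(P(\text{catches}), P(\text{late})) = (\tfrac25, \tfrac35),
\]

\[
\text{Friday probability vector}
= (\tfrac25, \tfrac35)\begin{pmatrix} \tfrac25 & \tfrac35 \\ \tfrac23 & \tfrac13 \end{pmatrix}^2
= (\tfrac25, \tfrac35)\begin{pmatrix} \tfrac{14}{25} & \tfrac{11}{25} \\ \tfrac{22}{45} & \tfrac{23}{45} \end{pmatrix}
= \left(\tfrac{194}{375}, \tfrac{181}{375}\right),
\]

\[
\Rightarrow P(\text{catches it on Friday}) = \tfrac{194}{375}.
\]

(c) In the equilibrium state

\[
(p, q)\begin{pmatrix} \tfrac25 & \tfrac35 \\ \tfrac23 & \tfrac13 \end{pmatrix} = (p, q),
\quad \text{and } p + q = 1,
\]

\[
\Rightarrow \tfrac25 p + \tfrac23 q = p, \quad p + q = 1
\Rightarrow p = \tfrac{10}{19} \text{ in the equilibrium state.}
\]

Hence, probability (over a long period) that he catches his bus is $\tfrac{10}{19}$.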
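
The arithmetic of Example 14 can be checked with a short sketch. It assumes the transition matrix with rows (2/5, 3/5) and (2/3, 1/3), states ordered (catches, late), which is consistent with the quoted answers 194/375 and 10/19; the helper name `step` is hypothetical.

```python
from fractions import Fraction

def step(p, M):
    """Post-multiply the row vector p by the transition matrix M."""
    n = len(M)
    return tuple(sum(p[i] * M[i][j] for i in range(n)) for j in range(n))

# States are ordered (catches, late); the entries are assumed from the example.
M = [[Fraction(2, 5), Fraction(3, 5)],
     [Fraction(2, 3), Fraction(1, 3)]]

wednesday = (Fraction(2, 5), Fraction(3, 5))  # he caught the bus on Tuesday
thursday = step(wednesday, M)
friday = step(thursday, M)

# Long-run probability of catching the bus, from p * M[0][1] = q * M[1][0].
long_run = M[1][0] / (M[0][1] + M[1][0])
print(friday, long_run)
```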


Exercise 1.4 

1 If Joan is late for work, she makes a greater effort to arrive on time the following 
work day. If she arrives on time, she is liable to be less careful the next day. 
Consequently, if she is late one day, the probability that she will be on time the 
next day is If she is on time one day, the probability that she will be late the next 
day is Given that she is on time on Monday, calculate the probability that in the 
same week she will be on time (a) on the Thursday, (b) on the Friday. 

Show that, in the long run, she will be on time \\ times as often as she is late. 

2 If I stay home one evening, the probability that I do not stay home the following
evening is 0.7, but if I do not stay home one evening, the probability that I do not
stay home the next evening is 0.6. Write down the transition matrix. Find the
probability that, over a long period, I will stay home on any evening.


Miscellaneous Exercise 1 

1 When three marksmen, A, B and C, take part in a shooting contest their in- 
dependent chances of hitting the target are 5 , 5 and respectively. Calculate the 
probability that one, and only one, bullet will hit the target when all three 
marksmen fire at it simultaneously. 

2 The independent probabilities that 3 light bulbs in a car will need replacing within 
a year are ^ and Calculate the probability that, within a year, (a) none, (b) 
at least 1, (c) 1 and only 1 of the 3 light bulbs will need replacing. 

3 Three civil servants, Smith, Jones and Brown, retire on the same day. If at least 2 
of them are still alive 5 years later, they agree that the survivors will meet. Find,
to three decimal places, the probability that they will meet as agreed, given that 
the independent probabilities of their each living 5 years after retirement are §, jq, 
7 , respectively. 

4 Let A and B be events with P(A U B) = P(A 0 5) = | and P(A') = f. Find 
P(A), P(B') and P (A O B'). 

5 Three balls are taken at random without replacement from a bag containing 5 
yellow, 4 green and 3 red balls. Find the probability that 

(a) all 3 are of the same colour, 

(b) all 3 are of different colours, 

(c) 2 are of the same colour and the third is of a different colour. 

6 A box A contains a red, b green and c pink cards. A similar box B contains p red, 
q green and r pink cards. A boy tosses a coin and, if it shows ‘heads’, he chooses 2 
cards at random from box A but if it shows ‘tails’, he chooses 2 cards at random 
from box B. If there are 50 cards in each box and drawing is without replacement, 
show that the probability that both cards drawn are of the same colour is 

\[
\frac{1}{4900}\left(a^2 + b^2 + c^2 + p^2 + q^2 + r^2\right) - \frac{1}{49}.
\]

7 It is known that 0.03% of the population suffer from a particular disease. A test to
discover the disease shows a positive reaction for 90% of people suffering from
the disease, and also for 1.5% of people not suffering from it. A randomly
selected person shows a positive reaction to the test. Find, to three decimal
places, the probability that that person does have the disease.

8 A and B are two events. Show that P(A) lies between P(A|B) and P(A|B').

9 A battery contains only two breeds of hens, X and Y; 75% of the egg production 
is from hens of breed X. Of the eggs laid by the X hens, 25% are size 1, 55% are 
size 2 and the remainder size 3. For the Y hens, the corresponding proportions are 
35%, 40% and 25%. Egg colour (brown or white) is independent of size in each 
breed; 40% of X eggs and 30% of Y eggs are brown. Find 

(a) the probability that an egg laid by a Y hen is size 1 and brown, 

(b) the probability that an egg is size 1 and white, 

(c) the probability that a white egg is size 1 , 

(d) which size grade contains the smallest proportion of white eggs. 

10 A survey of 500 graduates studying one or more courses in mathematics, physics 
and chemistry gave the following numbers of students attending classes in the 
indicated subjects: 

mathematics 256 both mathematics and physics 80 

physics 262 both physics and chemistry 165 

chemistry 340 both mathematics and chemistry 158 

Find the probability that a student selected at random from the group takes all 
three subjects. 

11 If the sun shines one day, the probability that it shines the next day is but if it 
does not shine, the probability that it shines the next day is 5. The sun does not 
shine on Monday. Calculate the probability that it will shine on Wednesday of the 
same week. 

12 The probability of a darts team winning a match is 0.5 and of drawing is 0.3, if the
previous match was won. If the previous match was drawn, the probability of
winning is 0.3 and of drawing is 0.4. If the previous match was lost, the probability
of winning is 0.1 and of drawing is 0.3. Find the transition matrix, and hence find
the probabilities of the team winning, of drawing and of losing any particular darts
match in the distant future, if the probabilities remain the same.

13 Given that events A and B are independent and that P(A) = \ and P(A H 5 ) = |, 
find P(B), P(B|A), and P(A ∪ B).

14 Given that P(A) = §, P(£|A) = §, P(£|A') = \ , determine 

(a) P(B ∩ A), (b) P(B ∩ A').

15 A bag contains 18 sweets, 3 of which are yellow and 15 of which are green. Sweets 
are drawn at random from this bag, one at a time and without replacement, until 
the first yellow sweet appears. Calculate the probability that this occurs on the 
fifth drawing. 

16 Each of three identical boxes, X, Y and Z, has two drawers. Box X does not 
contain any coins. Box Y contains one coin only. Box Z contains one coin in each 
drawer. A box is chosen at random and a drawer is opened and found to be 
empty. Find the probability that a coin will be found 

(a) if the other drawer in the same box is opened, 

(b) if one of the other two boxes is chosen at random and a drawer is opened. 

17 In a tasting trial, two pieces of cheddar cheese — one processed cheese, the other 
farmhouse cheese — are tasted and the taster is asked to identify the farmhouse 
cheese. Assuming that the taster cannot distinguish between the two types of 
cheese, 

(a) find the probability that the taster has at least 4 successes in 5 trials; 

(b) find the smallest value for n that will ensure that, in n trials, there is a
probability of at least 0.95 of the taster obtaining at least one success;

(c) find the smallest value for n that will ensure that, in n trials, there is a
probability of at most 0.5 of the taster obtaining fewer than 2 successes.

18 Three dice are to be thrown and the total score, S , is to be recorded. Find the 
probability that 

(a) S will be either 8 or 9, 

(b) 5 11, 

(c) S will be odd. 

Find the probability that only one ‘6’ will be thrown. 
2 Probability distributions


2.1 Random variables 

In Chapter 1 we saw that, for some experiments, the sample space (that is, the 
set of all possible outcomes) is a set of numerical values, whereas in others the 
outcomes are non-numerical. For example, the experiment of throwing a fair 
die and noting the number obtained has numerical outcomes, whereas the 
experiment of tossing a coin once has just two outcomes, ‘head’ or ‘tail’, 
forming the sample space. However, even in this latter example we could 
assign numbers 1 and 0 to ‘head’ and ‘tail’, respectively and so make each of 
the outcomes a numerical quantity. 

In probability work we are often interested in real numbers which either 
represent, or are assigned to, the outcomes of chance experiments; thus, 
every element (outcome) of the sample space S is associated with a unique 
numerical value x. This means that a function X is defined over all the points 
of the sample space, and x is the value of the function X at that particular 
element of S. Such a function X is called a random variable , and X may be 
discrete (if the number of possible outcomes is finite or countably infinite) or 
continuous (if X may assume all values in some interval a < x < b, where a , b 
may be infinite). 


2.2 Discrete probability distributions 

We have discussed in the previous chapter methods of finding the probability
P of a certain outcome occurring. If we denote by p(x) the probability that the
discrete random variable X takes the value x, then p(x), where P(X = x) =
p(x), is called the probability function of X. The points given by $(x_1, p(x_1))$,
$(x_2, p(x_2))$, . . . , for all the elements of the sample space (say, n elements), can
be plotted on a diagram (Fig. 2.1). This provides a visual illustration of the
probability distribution. Since the sum of the probabilities of all the possible
outcomes is 1 (certainty), we have

\[
\sum_{r=1}^{n} p(x_r) = 1.
\]

We also define the distribution function F(x) by the equation

\[
F(x_r) = P(X \le x_r) = \sum_{i=1}^{r} p(x_i).
\]

Fig. 2.1

The function F(x) is the cumulative probability function of X from the lowest
value of X, $x_1$, up to and including the value $x_r$.

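Example 1 The probability function of a discrete random variable X is given
by

\[
p(x) = \frac{k}{2^x}, \qquad x = 1, 2, \ldots.
\]

(a) Find the value of k.

(b) Find P(X ≤ 3).

(a)
\[
\sum p(x) = 1
\quad\text{and}\quad
\sum p(x) = k\left(\frac12 + \frac1{2^2} + \frac1{2^3} + \cdots\right) = k,
\]
since this is an infinite geometric progression with $a = r = \tfrac12$ in the usual
notation,

\[
\Rightarrow k = 1.
\]

(b)
\[
P(X \le 3) = p(1) + p(2) + p(3) = \frac12 + \frac1{2^2} + \frac1{2^3} = \frac78.
\]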


The expectation or expected mean value, E(X), of a discrete random
variable X is defined as

\[
E(X) = \sum x\,p(x)
\]

summed over all the possible values of X. That is, it is the sum of the products
of all the possible values of the random variable and their respective
probabilities. Sometimes we use the symbol $\mu$ (Greek mu) for E(X).

Example 2 A collector for charity asks you to put a coin in his tin. In your
pocket are three 1p, two 2p, four 5p and one 10p pieces. If you put your hand
into your pocket and pull out a single coin at random to put in his tin, find the
expected amount that you will thus give.

\[
P(X = 1) = \tfrac{3}{10}, \quad P(X = 2) = \tfrac15, \quad P(X = 5) = \tfrac25, \quad P(X = 10) = \tfrac1{10}.
\]

Hence,

\[
E(X) = \left(1 \times \tfrac{3}{10} + 2 \times \tfrac15 + 5 \times \tfrac25 + 10 \times \tfrac1{10}\right) \text{pence} = 3\tfrac{7}{10} \text{ pence}.
\]

If we considered a very large number of repeated trials of an experiment,
then the average value of the random variable X for these trials would be very
close to the expected mean. The expected mean, as we see above, may be an
impossible value for any given trial, since, in the case of Example 2, on any
given trial the coin donated will be either a 1p, 2p, 5p or 10p coin. E(X) is the
value found when we calculate the mean of all the possible values of X in this
experiment; it is the mean of the probability distribution of X.
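
The definition of E(X) translates directly into a short computation. The sketch below is illustrative only; it builds the probability function for the coins of Example 2 from the counts given there, and the name `expectation` is a hypothetical helper.

```python
from fractions import Fraction

def expectation(pmf):
    """E(X): the sum of x * p(x) over all possible values x of X."""
    return sum(x * p for x, p in pmf.items())

# Coins in the pocket: value in pence -> number of coins of that value.
coins = {1: 3, 2: 2, 5: 4, 10: 1}
total = sum(coins.values())
pmf = {value: Fraction(count, total) for value, count in coins.items()}

mean = expectation(pmf)
print(mean, float(mean))
```

The result, 37/10 pence, is not itself the value of any coin, which illustrates the remark that the expected mean may be an impossible value for a single trial.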

We can show that, if X is a discrete random variable and a and b are
constants, then

\[
E(aX + b) = aE(X) + b.
\]

Proof Whenever X takes the value x, aX + b takes the value ax + b,

\[
\Rightarrow P(aX + b = ax + b) = P(X = x) = p(x),
\]

since a, b are constants.

Thus

\[
\begin{aligned}
E(aX + b) &= \sum (ax + b)\,p(ax + b)\\
&= \sum (ax + b)\,p(x)\\
&= a\sum x\,p(x) + b\sum p(x)\\
&= aE(X) + b, \quad \text{because } \sum p(x) = 1.
\end{aligned}
\]


The mean of the sum and difference of two discrete random variables
X and Y

It can be proved that, if X and Y are two discrete random variables, then 

E(X + Y) = E(X) + E(Y), 

E(X - Y) = E(X) - E(Y). 

The proofs of these results are beyond the scope of this text, but we illustrate 
these results by an example. 
Example 3 Suppose that an experiment consists of tossing a fair die and a
fair coin, X being the outcome on the die and Y the outcome on the coin. For
the outcome Y, we denote a head by 1 and a tail by 0. The sample spaces are

for X: 1, 2, 3, 4, 5, 6, each with probability $\tfrac16$;
for Y: 0, 1, each with probability $\tfrac12$;
for (X, Y): (1,0), (2,0), (3,0), (4,0), (5,0), (6,0), (1,1), (2,1), (3,1),
(4,1), (5,1), (6,1), each with probability $\tfrac1{12}$.

\[
E(X + Y) = \tfrac1{12}(1 + 2 + 3 + 4 + 5 + 6 + 2 + 3 + 4 + 5 + 6 + 7) = \tfrac{48}{12} = 4.
\]

\[
E(X) = \tfrac16(1 + 2 + 3 + 4 + 5 + 6) = \tfrac72.
\]

\[
E(Y) = \tfrac12(0 + 1) = \tfrac12.
\]

Hence, in this case,

\[
E(X + Y) = E(X) + E(Y).
\]

Also

\[
E(X - Y) = \tfrac1{12}(1 + 2 + 3 + 4 + 5 + 6 + 0 + 1 + 2 + 3 + 4 + 5)
= \tfrac{36}{12} = 3 = E(X) - E(Y).
\]


The expected mean is not the only quantity of interest in a probability dis- 
tribution; we are interested also in how the values of the random variable X 
are spread about the mean. Two distributions could have the same expected 
mean, yet in one the X-values might be closely grouped around the mean, 
whereas in the other the distribution might be widely spread. 

Consider the two distributions

(i) $P(X = x) = p(x) = \tfrac13$ for x = 5, 50, 95,

(ii) $P(X = x) = p(x) = \tfrac13$ for x = 45, 50, 55.

For (i),

\[
E(X) = \tfrac13(5 + 50 + 95) = 50.
\]

For (ii),

\[
E(X) = \tfrac13(45 + 50 + 55) = 50.
\]


So the expected means are equal but we can see from Fig. 2.2 that the dis- 
tributions are quite different in the way that the values of X are spread about 
the mean value 50. 

There are various measures of spread or dispersion which can be used, the
one most commonly used being the variance. We define the variance of X,
written Var(X), as

\[
\mathrm{Var}(X) = E[X - E(X)]^2,
\]

and this is often denoted by the symbol $\sigma^2$. The positive square root of
Var(X) is called the standard deviation of X (often written as S.D. = $\sigma$). The
standard deviation is measured in the same units as those in which X is
measured.

Fig. 2.2

The evaluation of Var(X) can be simplified by using the result

\[
\mathrm{Var}(X) = E(X^2) - \mu^2.
\]

Proof

\[
\begin{aligned}
\mathrm{Var}(X) = E[X - \mu]^2 &= E[X^2 - 2\mu X + \mu^2]\\
&= E(X^2) - 2\mu E(X) + \mu^2, \quad \text{since } \mu \text{ is a constant,}\\
&= E(X^2) - 2\mu^2 + \mu^2, \quad \text{since } E(X) = \mu,\\
&= E(X^2) - \mu^2\\
&= \sum x^2 p(x) - \mu^2,
\end{aligned}
\]

the summation being taken over all the values of X.

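Example 4 We return to Example 2, the coins in the pocket problem. We
found that $\mu = 3\tfrac{7}{10}$ pence. We now calculate Var(X).

\[
\begin{aligned}
\mathrm{Var}(X) &= \left[\sum x^2 p(x) - \mu^2\right] (\text{pence})^2\\
&= \left[1^2 \times \tfrac{3}{10} + 2^2 \times \tfrac15 + 5^2 \times \tfrac25 + 10^2 \times \tfrac1{10} - \left(\tfrac{37}{10}\right)^2\right] (\text{pence})^2\\
&= 7.41\ (\text{pence})^2,
\end{aligned}
\]

and S.D. ≈ 2.72 pence.

Had we calculated this using the definition, we would have had

\[
\begin{aligned}
\mathrm{Var}(X) &= \left[\left(1 - \tfrac{37}{10}\right)^2 \times \tfrac{3}{10} + \left(2 - \tfrac{37}{10}\right)^2 \times \tfrac15 + \left(5 - \tfrac{37}{10}\right)^2 \times \tfrac25 + \left(10 - \tfrac{37}{10}\right)^2 \times \tfrac1{10}\right]\\
&= 7.41\ (\text{pence})^2,
\end{aligned}
\]

a longer calculation whether it is done with or without a hand calculator.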
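
Both routes to the variance can be compared directly. The following sketch assumes the probabilities 3/10, 1/5, 2/5 and 1/10 for the 1p, 2p, 5p and 10p coins of Example 2; the variable names are illustrative.

```python
from fractions import Fraction

# Probability function for the coin drawn, in pence (from Example 2).
pmf = {1: Fraction(3, 10), 2: Fraction(1, 5), 5: Fraction(2, 5), 10: Fraction(1, 10)}

mu = sum(x * p for x, p in pmf.items())

# Shortcut: Var(X) = E(X^2) - mu^2.
var_shortcut = sum(x * x * p for x, p in pmf.items()) - mu ** 2

# Definition: Var(X) = E[(X - mu)^2].
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())

sd = float(var_shortcut) ** 0.5
print(var_shortcut, var_definition, round(sd, 2))
```

The two expressions agree exactly, as the proof requires; the shortcut simply needs fewer subtractions.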

Using the same methods as those used for the results for the expected
mean, the following results are obtained.

(i) If X is a discrete random variable and a and b are constants, then

\[
\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X).
\]

Proof

\[
\begin{aligned}
\mathrm{Var}(aX + b) &= E(aX + b)^2 - (a\mu + b)^2 = E(a^2X^2 + 2abX + b^2) - (a\mu + b)^2\\
&= a^2E(X^2) + 2abE(X) + b^2 - (a\mu + b)^2, \quad \text{since } a, b \text{ are constants,}\\
&= a^2E(X^2) + 2ab\mu + b^2 - a^2\mu^2 - 2ab\mu - b^2\\
&= a^2[E(X^2) - \mu^2] = a^2\,\mathrm{Var}(X).
\end{aligned}
\]

Hence, if we take a linear function of the discrete random variable X, that is
(aX + b), where a, b are constants and X has mean $\mu$, variance $\sigma^2$, then

\[
E(aX + b) = a\mu + b, \qquad \mathrm{Var}(aX + b) = a^2\sigma^2.
\]

(ii) If X and Y are independent discrete random variables, then

\[
\mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y).
\]

Note here that, for the variances, the result is the addition of Var(X) and
Var(Y) when we are taking the variance of the sum of X and Y and also when
we are taking the variance of the difference of X and Y.

As in the case of E(X ± Y), it is too difficult for us to justify these results in
this text. Again we illustrate the results by an example. To return to Example
3, on the die and the coin, X representing the outcome on the die and Y the
outcome on the coin, the two random variables X and Y are independent in
this case.

\[
\mathrm{Var}(X) = \tfrac16[1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2] - \left(\tfrac72\right)^2 = \tfrac{35}{12}.
\]

\[
\mathrm{Var}(Y) = \tfrac12[0^2 + 1^2] - \left(\tfrac12\right)^2 = \tfrac14.
\]

\[
\mathrm{Var}(X + Y) = \tfrac1{12}[2^2 + 3^2 + 4^2 + 5^2 + 6^2 + 7^2 + 1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2] - (4)^2 = \tfrac{19}{6}.
\]

\[
\mathrm{Var}(X - Y) = \tfrac1{12}[1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 + 0^2 + 1^2 + 2^2 + 3^2 + 4^2 + 5^2] - (3)^2 = \tfrac{19}{6}.
\]

Hence,

\[
\mathrm{Var}(X + Y) = \mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y).
\]
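
The die-and-coin illustration can be reproduced by listing the twelve equally likely outcomes. This is a sketch under the assumptions of Example 3 (fair die, fair coin, head scored as 1 and tail as 0); the helper names `mean` and `variance` are illustrative.

```python
from fractions import Fraction
from itertools import product

def mean(values):
    """Mean of a list of equally likely values."""
    return Fraction(sum(values), len(values))

def variance(values):
    """Variance of equally likely values, using E(X^2) - [E(X)]^2."""
    return Fraction(sum(v * v for v in values), len(values)) - mean(values) ** 2

die = list(range(1, 7))
coin = [0, 1]
outcomes = list(product(die, coin))          # 12 equally likely (x, y) pairs
sums = [x + y for x, y in outcomes]
diffs = [x - y for x, y in outcomes]

print(mean(sums), mean(diffs))
print(variance(sums), variance(diffs), variance(die) + variance(coin))
```

The means add and subtract, whereas the variances add in both cases.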

Exercise 2.2 

1 Find, to two decimal places, the expectation E(X), and the variance Var(X), for
each of the following distributions:

(a)
    X = x    2      4      7
    p(x)     1/3    1/2    1/6

(b)
    X = x    -3     -1     2      4
    p(x)     0.4    0.1    0.2    0.3


2 A distribution of positive integers has probability function

\[
p(r) = \frac{1}{31}\binom{5}{r} \quad \text{for } r = 1, 2, 3, 4, 5,
\]

\[
p(r) = 0 \quad \text{for } r > 5.
\]

Prove that the expected mean value is $\dfrac{80}{31}$ and that the variance is $\dfrac{1040}{961}$.

3 In a given business venture, a person can make a profit of £10 000 with probability
0.7, or take a loss of £6000 with probability 0.3. Find the person's expected gain.

4 Two boys are playing a game in which one boy tosses two fair coins. If he gets two 
‘heads’, the other boy pays him 3p, if he gets only one ‘head’, the other boy pays 
him 2p, but if he gets no ‘heads’ he has to pay the other boy 5p. Find the expected 
winnings, on a toss, of the boy tossing the coins. 

5 A fair coin is tossed until either one ‘head’ or four ‘tails’ occur. Find the expected 
number of tosses of the coin. 


2.3 Continuous probability distributions 

If X is a continuous random variable, we define f(x), the probability density
function (pdf), as the function satisfying the conditions

(i) f(x) ≥ 0 for all x ∈ S, the sample space, and

(ii) $\int_S f(x)\,dx = 1$, where $\int_S$ means integration over the sample space.

Further, we define, for any $x_0 < x_1$ in S,

\[
P(x_0 < X < x_1) = \int_{x_0}^{x_1} f(x)\,dx,
\]

and, hence, $P(x_0 < X < x_1)$ represents the area of the region under the graph
of the probability density function f(x) between the limits $x = x_0$ and $x = x_1$.
Whenever X takes values only in some finite interval, we may assume that the
pdf is zero elsewhere, so that we can write

\[
\int_{-\infty}^{+\infty} f(x)\,dx = 1,
\]

as a general result. Using this convention of the pdf being zero where it is not
defined, we can then define F(x), the probability (cumulative) distribution
function, by the relation

\[
F(x_0) = P(X \le x_0) = \int_{-\infty}^{x_0} f(x)\,dx.
\]

This represents the area of the region under the graph of the pdf, f(x), from
$x = -\infty$ to $x = x_0$. The function F(x) never decreases; it rises from zero at the
bottom of the range to unity at the top of the range.

There is one special point which we must notice when X is a continuous
random variable which does not arise when we deal with a discrete X. If we
consider

\[
P(x_0 < X < x_1) = \int_{x_0}^{x_1} f(x)\,dx,
\]

and let $x_0$ approach $x_1$, then, in the limit, the integral on the right-hand side
becomes zero. This means that, whereas we can speak of the probability that
a discrete random variable is exactly $x_0$, there is no equivalent to this in the
continuous random variable case, and we can speak only of the probability of
X lying in a given interval. Certainly $f(x_0)$ does not represent the probability
that X takes the value $x_0$. From the definition of F(x) we see that $F(x_0)$ is the
integral of f(x) between the limits $x = -\infty$ and $x = x_0$, and hence,

\[
f(x) = \frac{dF(x)}{dx}.
\]
Example 5 The pdf of the continuous random variable X is given by

f(x) = kx², 0 ≤ x ≤ 1,
f(x) = 0, elsewhere.

Find (a) the value of k,

(b) $P(X \le \tfrac12)$,

(c) $P(\tfrac14 < X < \tfrac12)$.

Illustrate the probability distribution by a sketch.

(a)
\[
\int_{-\infty}^{+\infty} f(x)\,dx = 1 \Rightarrow \int_0^1 kx^2\,dx = \frac{k}{3} = 1 \Rightarrow k = 3.
\]

(b)
\[
P(X \le \tfrac12) = \int_{-\infty}^{1/2} f(x)\,dx = \int_0^{1/2} 3x^2\,dx = \frac18.
\]

(c)
\[
P(\tfrac14 < X < \tfrac12) = \int_{1/4}^{1/2} 3x^2\,dx = \frac{7}{64}.
\]

The probability distribution is shown in Fig. 2.3.
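
Because the pdf in Example 5 is a polynomial, its integrals can be evaluated exactly term by term. The sketch below is one way to do this; `integrate_poly` is a hypothetical helper written for the purpose, and a polynomial is represented by its list of coefficients in increasing powers of x.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Exact integral from a to b of the polynomial sum(coeffs[n] * x**n)."""
    def antiderivative(x):
        return sum(Fraction(c) * x ** (n + 1) / (n + 1)
                   for n, c in enumerate(coeffs))
    return antiderivative(Fraction(b)) - antiderivative(Fraction(a))

x_squared = [0, 0, 1]                      # coefficients of x^2
k = 1 / integrate_poly(x_squared, 0, 1)    # total probability must equal 1
pdf = [k * c for c in x_squared]           # coefficients of k x^2

p_b = integrate_poly(pdf, 0, Fraction(1, 2))
p_c = integrate_poly(pdf, Fraction(1, 4), Fraction(1, 2))
print(k, p_b, p_c)
```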


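Just as we did for the discrete case, we can define E(X) and Var(X) when X
is a continuous random variable, but now our definitions involve integrals
rather than summation signs. We define

\[
\mu = E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx,
\]

the expected mean value of X, and

\[
\sigma^2 = \mathrm{Var}(X) = \int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\,dx,
\]

the variance of X.

As before, we can assist in evaluating Var(X) by using the result

\[
\mathrm{Var}(X) = \int_{-\infty}^{+\infty} x^2 f(x)\,dx - \mu^2.
\]

Proof

\[
\begin{aligned}
\int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\,dx
&= \int_{-\infty}^{+\infty} x^2 f(x)\,dx - 2\mu\int_{-\infty}^{+\infty} x f(x)\,dx + \mu^2\int_{-\infty}^{+\infty} f(x)\,dx\\
&= \int_{-\infty}^{+\infty} x^2 f(x)\,dx - 2\mu^2 + \mu^2\\
&= \int_{-\infty}^{+\infty} x^2 f(x)\,dx - \mu^2,
\end{aligned}
\]

since $\int_{-\infty}^{+\infty} x f(x)\,dx = \mu$ and $\int_{-\infty}^{+\infty} f(x)\,dx = 1$,
or

\[
\mathrm{Var}(X) = \int_{-\infty}^{+\infty} x^2 f(x)\,dx - \mu^2 = E(X^2) - \mu^2.
\]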

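Example 6 A random variable X has cumulative distribution function

F(x) = 0, x ≤ 0,

F(x) = kx⁴, 0 < x ≤ 2,

F(x) = 1, x > 2.

Find (a) f(x), the pdf,

(b) E(X),

(c) Var(X).

Illustrate the probability density function by a sketch.

(a)
\[
f(x) = \frac{dF(x)}{dx} = 4kx^3.
\]

F(x) must be unity at x = 2,

\[
\Rightarrow 16k = 1, \quad k = \frac1{16},
\]

giving

\[
f(x) = \frac{x^3}{4}, \quad 0 \le x \le 2, \qquad f(x) = 0, \text{ elsewhere.}
\]

(b)
\[
E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx = \int_0^2 \frac{x^4}{4}\,dx = \left[\frac{x^5}{20}\right]_0^2 = \frac85.
\]

(c)
\[
\mathrm{Var}(X) = \int_0^2 \frac{x^5}{4}\,dx - \left(\frac85\right)^2
= \left[\frac{x^6}{24}\right]_0^2 - \left(\frac85\right)^2
= \frac{64}{24} - \frac{64}{25} = \frac{8}{75}.
\]

The probability density function is shown in Fig. 2.4.

Fig. 2.4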
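
The mean and variance in Example 6 can be checked by exact polynomial integration. The sketch assumes the pdf $f(x) = x^3/4$ on $0 \le x \le 2$ found in part (a); `integrate_poly` and `times_x` are hypothetical helpers, with a polynomial stored as its coefficients in increasing powers of x.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Exact integral from a to b of the polynomial sum(coeffs[n] * x**n)."""
    def antiderivative(x):
        return sum(Fraction(c) * x ** (n + 1) / (n + 1)
                   for n, c in enumerate(coeffs))
    return antiderivative(Fraction(b)) - antiderivative(Fraction(a))

def times_x(coeffs, power=1):
    """Coefficients of x**power multiplied by the given polynomial."""
    return [0] * power + list(coeffs)

pdf = [0, 0, 0, Fraction(1, 4)]            # f(x) = x^3 / 4 on [0, 2]

total = integrate_poly(pdf, 0, 2)          # should be 1 for a valid pdf
mean = integrate_poly(times_x(pdf), 0, 2)  # E(X) = integral of x f(x)
var = integrate_poly(times_x(pdf, 2), 0, 2) - mean ** 2
print(total, mean, var)
```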


Example 7 The continuous random variable X has pdf

f(x) = 2x/9, 0 ≤ x ≤ 3,
f(x) = 0, elsewhere.

(a) If two independent determinations of X are made, find the probability
that both of them will be greater than 2.

(b) If three independent determinations of X are made, find the probability
that two and only two of these are greater than 2.

[Note that since f(x) ≥ 0 and $\int_0^3 \frac{2x}{9}\,dx = 1$, it follows that
f(x) = 2x/9, 0 ≤ x ≤ 3, does indeed represent a pdf.]

(a)
\[
P(X > 2) = \int_2^3 \frac{2x}{9}\,dx = \frac59.
\]

\[
P(X_1 > 2 \text{ and } X_2 > 2) = \frac59 \times \frac59 = \frac{25}{81}, \text{ since the readings are independent.}
\]

(b)
\[
P(X \le 2) = 1 - \frac59 = \frac49.
\]

The required probability is

\[
P(X_1, X_2 > 2,\ X_3 \le 2) + P(X_1, X_3 > 2,\ X_2 \le 2) + P(X_2, X_3 > 2,\ X_1 \le 2)
= \frac59 \times \frac59 \times \frac49 \times 3 = \frac{100}{243}.
\]
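
The counting in part (b) of Example 7 amounts to choosing which two of the three readings exceed 2. A minimal sketch, assuming $P(X > 2) = 1 - F(2)$ with $F(x) = x^2/9$ on $0 \le x \le 3$ (obtained by integrating the given pdf):

```python
from fractions import Fraction
from math import comb

# P(X > 2) = 1 - F(2), where F(x) = x^2 / 9 for 0 <= x <= 3.
p = 1 - Fraction(2 ** 2, 9)

both = p ** 2                                 # part (a): two readings, both > 2
exactly_two = comb(3, 2) * p ** 2 * (1 - p)   # part (b): two and only two of three
print(p, both, exactly_two)
```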


Exercise 2.3 

1 Given that X is a continuous random variable with pdf 

f(x) = \x + k, for 0 ^ x 2, 
f(jc) = 0, elsewhere, 

find the value of k. Find also P(j ^ X ^ 1). 

2 The probability density function of a continuous random variable X is given by

f(x) = x(x - 1)(x - 3) for 0 ≤ x ≤ 1,
f(x) = k for 1 < x ≤ 3,
f(x) = 0 otherwise,

where k is a suitable constant. Find the value of k. Find E(X). Find also the
probability that X is less than or equal to E(X).

3 Find the distribution function F(x) of the continuous random variable X whose
probability density function is given by

f(x) = x/2 for 0 ≤ x ≤ 1,
f(x) = 1/2 for 1 < x ≤ 2,
f(x) = (3 - x)/2 for 2 < x ≤ 3,
f(x) = 0 elsewhere,

and illustrate F(x) by a sketch.

4 The distribution function of the continuous random variable X is given by

F(x) = 1 - 4/x² for x > 2,
F(x) = 0 elsewhere.

Find (a) P(X > 4), (b) the pdf of X.

5 The time in minutes that the Inter-City train between two cities is early or late in
arriving is a random variable with pdf given by

f(x) = 3(16 - x²)/256 for -4 < x < +4,
f(x) = 0 elsewhere,

where negative values of x indicate the train arriving early and positive values
indicate the train arriving late. Find the probability that the train will arrive

(a) at least 2 minutes late,

(b) at least 1 minute early,

(c) between 1 and 3 minutes late.


Miscellaneous Exercise 2 

1 An integer takes the value r with probability λr, λ being a constant, for
0 < r ≤ 3n; the probability is zero elsewhere. Find the value of λ and show
that the expected mean value is (6n + 1)/3. Show also that the variance is
(3n + 2)(3n - 1)/18.

[Note that
\[
\sum_{r=1}^{n} r = \frac{n(n + 1)}{2}, \qquad
\sum_{r=1}^{n} r^2 = \frac{n(n + 1)(2n + 1)}{6}, \qquad
\sum_{r=1}^{n} r^3 = \frac{n^2(n + 1)^2}{4}.]
\]

2 When N people are inoculated it is known that each individual may experience an
adverse reaction. Denote by X the number of people who react adversely.
Assuming that the probability distribution of X is

\[
P(X = r) = \frac{k}{2^r} \quad \text{for } r = 0, 1, \ldots, N,
\]

where k is a positive constant, find k in terms of N. Find also, in terms of k and n,
the probability that at least n of the people inoculated react adversely.

Show that, when N = 5, the probability of there being at least one adverse
reaction is approximately 0.492.

3 A random variable R takes the integer value r with probability P(r) defined by

P(r) = λr for r = 1, 2, 3, 4, 5,

P(r) = λ(11 - r) for r = 6, 7, 8,

P(r) = 0 for all other r.

Find the value of the constant λ. Find also E(R) and Var(R). Represent the
probability distribution of R on a suitable diagram. Write down the mean and
variance of (a) 2R - 5, (b) 5R₁ - 4R₂, where R₁ and R₂ are independent
observations of R.

4 The discrete random variable X has a probability function P(X) defined by 

P(0) = P(9) = 

P(l) = P(6) = i 

P(4) = 2 . 

P(X) = 0 elsewhere.

Draw a sketch to illustrate this probability distribution and find E(X) and Var(X).
Find also E(Y) and Var(Y) when Y = 5X - 3.

5 The probability density function of the random variable X is given by

f(x) = 0 for x < 0,

f(x) = cx for 0 ≤ x ≤ 1,

f(x) = 0 for x > 1,

where c is a positive constant. Find c, and evaluate E(X).

6 The probability density function f of the random variable X is given by

f(x) = Ax²(1 - x) for 0 ≤ x ≤ 1,
f(x) = 0 elsewhere.

(a) Evaluate A.

(b) Sketch the graph of f(x).

(c) Find E(X) and Var(X).

(d) Calculate ?(X =s 3). 

7 The probability density function for the random variable X is given by

f(x) = k sin x for 0 ≤ x ≤ π,

f(x) = 0 for x < 0 and x > π.

Find (a) the value of k,

(b) P(* ^ 2), 

(c) Var(X).

8 The random variable X has probability density function 

f(jc) = 1 for 0 ^ jc ^ k, 

f(jc) = | for k < x ^ 2, 

f(jc) = 0 elsewhere. 

Find k and calculate E(A) and Var(2f). Sketch the graph of the distribution 
function for this distribution. 

9 A random variable X has probability density function f given by

f(x) = cx^k for 0 ≤ x ≤ 1,
f(x) = 0 otherwise,

where c and k are constants. Find an expression, in terms of k, for E(X).

10 A random variable X takes only values x such that k ≤ x ≤ 3, and, in this range,
P(X ≤ x) = λ(x - 1)(x + 3)(5 - x). Explain why k = 1. Calculate the value of λ
and find P(X ≤ 2).

11 The pdf f(x) of the continuous random variable X is given by

f(x) = b(cx - x²) for 0 ≤ x ≤ 2,
f(x) = 0 for x < 0, x > 2,

where b and c are positive constants. Show that c ≥ 2 and that b = 3/(6c - 8).

Given that E(A) = f , calculate the values of b and c. Sketch the graph of f(jc) 
and find Var(Z). 

12 The random variable X has probability density function f(x) given by

f(x) = cx(x - 2)² for 0 ≤ x ≤ 2,
f(x) = 0 elsewhere,

where c is a constant. Find the value of c.

Find E(X) and Var(X). Show that P(1 ≤ X ≤ 2) is approximately equal to 0.31.

13 The probability density function f(x) for the random variable X is defined by

f(x) = k(3 + 2x) for 2 ≤ x ≤ 4,
f(x) = 0 otherwise.

Determine the value of k and sketch the graph of f(x).

Calculate E(X).

Sketch the distribution function F(x).

Calculate P(2*5 ^ X ^ 4), and find the value x 0 such that P(X ^ * 0 ) = 

14 The continuous random variable X can assume values only between 0 and 4, and
its pdf f(x) is given by f(x) = k sin(πx/4) for 0 ≤ x ≤ 4. Find the values of k and of
E(X). Show that Var(X) = 4[1 - (8/π²)].

15 An ironmonger is supplied with paraffin once a week. The weekly demand, x
hundred litres, has the continuous probability density function f(x), where
f(x) = 6(1 - x)⁵ for 0 ≤ x ≤ 1. Find, to two decimal places, the required capacity
of the paraffin tank if the probability that it will be exhausted in a given week
does not exceed 0.01.

16 A point X is taken at random in a line PQ of length 2l, all positions of the point
being equally likely. Find the expected value of the product PX.XQ and show
that the probability that this product exceeds l²/4 is √3/2.
3 Some discrete probability distributions


3.1 Introduction 

There are some theoretical probability distributions that occur sufficiently 
frequently in statistical work to warrant individual consideration in our text. 
In this chapter we discuss four discrete probability distributions: uniform, 
binomial, geometric and Poisson. 


3.2 The discrete uniform distribution 

A discrete random variable X, whose probability function p(r) is given by

p(r) = 1/k for r = 1, 2, . . . , k,
p(r) = 0 otherwise,

where k is a constant integer, has a discrete uniform distribution. From the
sketch of the distribution (Fig. 3.1) we can see why a uniform distribution is
sometimes referred to as a rectangular distribution. We can find the expected
mean of this distribution,

\[
E(X) = 1 \cdot \frac1k + 2 \cdot \frac1k + \cdots + k \cdot \frac1k
= \frac1k[1 + 2 + \cdots + k]
= \frac{k(k + 1)}{k \cdot 2}
= \frac{k + 1}{2},
\]

since the sum of the first k natural numbers is equal to [k(k + 1)]/2. When we
look at the shape of the distribution as shown in Fig. 3.1, this is an obvious
result for the mean.

Fig. 3.1 (the probability function p(r), of constant height 1/k, plotted against r for r = 1, 2, . . . , k)

The variance

\[
\mathrm{Var}(X) = \frac1k\left(1^2 + 2^2 + \cdots + k^2\right) - [E(X)]^2
= \frac1k \cdot \frac{k(k + 1)(2k + 1)}{6} - \frac{(k + 1)^2}{4},
\]

using the result that the sum of the squares of the first k natural numbers is
[k(k + 1)(2k + 1)]/6,

\[
\Rightarrow \mathrm{Var}(X) = \frac{2k^2 + 3k + 1}{6} - \frac{k^2 + 2k + 1}{4} = \frac{k^2 - 1}{12}.
\]
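
The closed forms $(k + 1)/2$ and $(k^2 - 1)/12$ can be checked against direct summation for a few values of k. This is a small illustrative sketch; the function name `uniform_mean_var` is hypothetical.

```python
from fractions import Fraction

def uniform_mean_var(k):
    """Mean and variance of the discrete uniform distribution on 1, 2, ..., k."""
    values = range(1, k + 1)
    mean = Fraction(sum(values), k)
    var = Fraction(sum(r * r for r in values), k) - mean ** 2
    return mean, var

for k in (2, 6, 10):
    mean, var = uniform_mean_var(k)
    # Compare the direct sums with the closed forms derived in the text.
    print(k, mean, var, mean == Fraction(k + 1, 2), var == Fraction(k * k - 1, 12))
```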


Example 1 An unbiased die is to be tossed repeatedly. Denoting by X the
number which is the outcome of a throw, write down the probability function
of X and find the mean and variance of X.

Since the die is unbiased, each of the numbers 1 to 6 has an equal chance of
occurring, giving

\[
P(X = r) = p(r) = \tfrac16, \qquad r = 1, 2, 3, 4, 5, 6.
\]

Here we have a discrete uniform distribution with k = 6.

\[
E(X) = \frac{6 + 1}{2} = \frac72,
\qquad
\mathrm{Var}(X) = \frac{36 - 1}{12} = \frac{35}{12}.
\]

Example 2 A boy tosses a fair die. If a non-prime number occurs, he will 
win that number of pence, but if a prime number occurs, he will forfeit that 
number of pence. Write down the possible outcomes (gain or loss) of a toss, 
with their respective probabilities, and hence find his expected gain or loss on 
one toss. 


Let X pence be his gain or loss. Then X takes the values

$$+1, -2, -3, +4, -5, +6, \text{ each with probability } \tfrac{1}{6},$$

since 2, 3, 5 are prime; 1, 4, 6 are not.

$$E(X) = \frac{1 - 2 - 3 + 4 - 5 + 6}{6} = \frac{1}{6}.$$

His expected gain is 1/6 pence.

Exercise 3.2

1 A discrete uniform distribution is defined by P(X = r) = 1/k for r = 0, 1, ..., (k - 1). Find the mean of the distribution and show that the variance is equal to (k^2 - 1)/12.

2 A girl is playing a game in which she tosses a fair die. If an even number occurs, 
she wins twice that number of pence, but if an odd number occurs, she forfeits 
three times that number of pence. Find her expected gain on one toss. 

3 The discrete random variable X is known to be uniformly distributed over the set of consecutive integers {a, a + 1, ..., b}. Show that

$$E(X) = \frac{a + b}{2}, \qquad \operatorname{Var}(X) = \frac{(b - a)(b - a + 2)}{12}.$$

Given that E(X) = 6 and Var(X) = 2, find the values of a and b.


3.3 The binomial distribution 

Suppose that we perform exactly n times an experiment which has only two possible outcomes, E or E'. We say that we have conducted n trials of the experiment. Given that each trial is independent of all the others and that the probability of the outcome E, P(E), is constant and equal to p throughout the n trials, then we can find the probability of getting 0, 1, 2, ..., r, ..., n outcomes E in the n trials. Let the variable X represent the number of ‘successes’, or outcomes E, in the n trials. Then

$$\begin{aligned}
P(X = 0) &= P(E' \cap E' \cap E' \cdots n \text{ times}) \\
&= P(E')\,P(E') \cdots n \text{ times, since the trials are independent,} \\
&= (1 - p)^n.
\end{aligned}$$

For P(X = 1) we must have the outcome E occurring once and the outcome E' occurring (n - 1) times. However, the outcome E could occur on any one of the n trials, giving n ways in which the total of n outcomes could be arranged. Hence,

$$P(X = 1) = n(1 - p)^{n-1}p.$$

To generalise for P(X = r), where r ≤ n, we now have the outcome E occurring r times and the outcome E' occurring (n - r) times. The r outcomes E could occur on any r of the n trials. Using the ideas of combinations, we can choose r places out of n in $\binom{n}{r}$ ways, where $\binom{n}{r} = \dfrac{n!}{r!\,(n - r)!}$.

Thus,

$$P(X = r) = \binom{n}{r}(1 - p)^{n-r}p^r, \quad r = 0, 1, 2, \ldots, n.$$

These probabilities are also the consecutive (n + 1) terms of the binomial expansion of $[(1 - p) + p]^n$, confirming that the total sum of the probabilities is $1^n = 1$ (certainty).
The distribution of X is called the binomial distribution, and we define it formally thus: The discrete random variable X having a probability function

$$P(X = r) = p(r) = \binom{n}{r}(1 - p)^{n-r}p^r, \quad \text{where } 0 \le p \le 1 \text{ and } r = 0, 1, 2, \ldots, n,$$

is said to have a binomial distribution, B(n, p).

We can write this X ~ B(n, p), to be read as X is distributed binomially, with n independent trials and p the constant probability of ‘success’.


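The mean and variance of B(n, p)

$$\begin{aligned}
E(X) &= \sum_{r=0}^{n} r\binom{n}{r}(1 - p)^{n-r}p^r \\
&= \sum_{r=1}^{n} r\,\frac{n!}{r!\,(n - r)!}(1 - p)^{n-r}p^r, \text{ since the term when } r = 0 \text{ contributes zero to the sum,} \\
&= np\sum_{r=1}^{n} \frac{(n - 1)!}{(r - 1)!\,[n - 1 - (r - 1)]!}(1 - p)^{(n-1)-(r-1)}p^{r-1} \\
&= np\sum_{R=0}^{N} \frac{N!}{R!\,(N - R)!}(1 - p)^{N-R}p^R, \text{ writing } N = n - 1,\ R = r - 1 \text{ in the summation,} \\
&= np[(1 - p) + p]^N \\
&= np.
\end{aligned}$$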

We could have shortened this work a little by using the notation q = 1 - p, and we use this when finding Var(X).

$$\begin{aligned}
E(X^2) &= \sum_{r=0}^{n} r^2\binom{n}{r}q^{n-r}p^r \\
&= \sum_{r=1}^{n} [r(r - 1) + r]\binom{n}{r}q^{n-r}p^r,
\end{aligned}$$

since the term for which r = 0 contributes zero to the summation,

$$\begin{aligned}
&= \sum_{r=1}^{n} r(r - 1)\,\frac{n!}{(n - r)!\,r!}\,q^{n-r}p^r + \sum_{r=1}^{n} r\binom{n}{r}q^{n-r}p^r \\
&= n(n - 1)p^2\sum_{r=2}^{n} \frac{(n - 2)!}{(n - r)!\,(r - 2)!}\,q^{n-r}p^{r-2} + np,
\end{aligned}$$

since the term for which r = 1 contributes zero to the first summation, and we have already shown, when finding E(X), that the second summation is equal to np.

If we put N = n - 2, R = r - 2 in the first summation, we have

$$\begin{aligned}
E(X^2) &= n(n - 1)p^2\sum_{R=0}^{N} \binom{N}{R}q^{N-R}p^R + np \\
&= n(n - 1)p^2[q + p]^N + np \\
&= n(n - 1)p^2 + np, \text{ since } q + p = 1,
\end{aligned}$$

$$\Rightarrow \operatorname{Var}(X) = n(n - 1)p^2 + np - n^2p^2 = np(1 - p) = npq.$$
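As a numerical check on these two results, the sums defining E(X) and Var(X) can be evaluated term by term for a particular case. The sketch below assumes the illustrative values n = 10 and p = 1/5; the helper names `binomial_pmf` and `moments` are our own.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """List of P(X = r), r = 0..n, for X ~ B(n, p)."""
    q = 1 - p
    return [comb(n, r) * q ** (n - r) * p ** r for r in range(n + 1)]


def moments(pmf):
    """Mean and variance of a distribution on 0, 1, 2, ... given as a list."""
    mean = sum(r * pr for r, pr in enumerate(pmf))
    second_moment = sum(r * r * pr for r, pr in enumerate(pmf))
    return mean, second_moment - mean ** 2


n, p = 10, Fraction(1, 5)   # illustrative values only
pmf = binomial_pmf(n, p)
mean, var = moments(pmf)

assert sum(pmf) == 1             # the probabilities sum to 1
assert mean == n * p             # E(X) = np
assert var == n * p * (1 - p)    # Var(X) = npq
```

Exact fractions are used so that the comparisons with np and npq are equalities rather than approximate floating-point checks.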


A useful equality 

In problems on the binomial distribution where we have to calculate more than one probability, a useful equality connects consecutive terms in a binomial expansion. It is

$$P(r + 1) = \frac{n - r}{r + 1} \cdot \frac{p}{1 - p} \cdot P(r).$$

For example, the value of P(3) can be obtained from that of P(2) by using the equality

$$P(3) = \frac{n - 2}{3} \cdot \frac{p}{1 - p} \cdot P(2),$$

where n and p will be given in the problem.
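The equality lends itself to building up all the probabilities in turn, starting from P(0) = (1 - p)^n. The following Python sketch illustrates this; the function name `binomial_probabilities` is our own, and the values n = 10, p = 0.2 are those used in Example 3.

```python
def binomial_probabilities(n, p):
    """P(0), ..., P(n) for B(n, p) with 0 <= p < 1, using the recurrence
    P(r + 1) = (n - r)/(r + 1) * p/(1 - p) * P(r)."""
    probs = [(1 - p) ** n]  # P(0)
    for r in range(n):
        probs.append(probs[-1] * (n - r) / (r + 1) * p / (1 - p))
    return probs


probs = binomial_probabilities(10, 0.2)
print(round(probs[1], 3))             # P(X = 1)
print(round(probs[0] + probs[1], 3))  # P(X < 2)
print(round(1 - sum(probs[:3]), 3))   # P(X >= 3)
```

Each new term costs only one multiplication, which is the saving the equality offers when several probabilities are needed.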

Example 3 It is known that 20% of the articles produced by a certain 
machine are defective. Given that a random sample of 10 articles produced by 
the machine is taken, find the probability that the sample will contain 

(a) one defective article, 

(b) less than two defective articles, 

(c) at least three defective articles. 

Let X be the number of defectives in the sample of 10 articles. Then X will have a binomial distribution with n = 10 and p = 0.2, since 20% of the articles are defective and we are calling the occurrence of a defective article a ‘success’. We can write

X ~ B(10, 0.2).

(a) $P(X = 1) = \binom{10}{1}(1 - 0.2)^9(0.2) \approx 0.268.$

(b) $P(X < 2) = P(0) + P(1) = (0.8)^{10} + 10(0.8)^9(0.2) \approx 0.376.$

(c) $P(X \ge 3) = P(3) + P(4) + \cdots + P(10)$, but this will involve many calculations. Instead, we will use

$$P(X \ge 3) = 1 - P(X < 3),$$

since the sum of all the probabilities is 1,

$$\begin{aligned}
&= 1 - P(0) - P(1) - P(2) \\
&= 1 - (0.8)^{10} - 10(0.8)^9(0.2) - \binom{10}{2}(0.8)^8(0.2)^2 \\
&\approx 0.322.
\end{aligned}$$


Example 4 A fair die is to be tossed 1500 times. Given that the random 
variable X represents the number of times that a ‘4’ occurs, find the mean and 
the variance of X. Write down the standard deviation of X. 


Here we have X ~ B(1500, 1/6), since the probability of getting a ‘4’ is 1/6 on each toss and there are 1500 independent tosses of the die.

Mean, E(X) = np = 1500/6 = 250.

Variance, $\operatorname{Var}(X) = np(1 - p) = 1500 \times \tfrac{1}{6} \times \tfrac{5}{6} = 208\tfrac{1}{3}.$

Standard deviation $= \sqrt{\text{variance}} = \dfrac{25\sqrt{3}}{3}.$


Example 5 In a pole-vault competition, in order to enter the competition, 
each person is allowed not more than 3 attempts to jump once successfully a 
certain qualifying height. Given that p (a constant) is the probability of a 
person failing to jump that height successfully at any one attempt, find, in 
terms of p, the probability that a person will qualify to enter the competition. 

Given that people attempt to qualify to enter the competition in teams of 6 
and that a team is considered as qualified to enter if at least 5 of the team have 
individually qualified, find, in terms of p, the probability of a team qualifying. 

Let E represent the outcome that a person qualifies; that is, that a successful jump is achieved on the first or, if necessary, the second or third attempt. The probability of a successful jump is (1 - p). Then

$$P(E) = (1 - p) + p(1 - p) + p^2(1 - p) = 1 - p^3.$$

This is the probability that a person qualifies, and so the probability that a person does not qualify, P(E'), is p^3. This is an obvious result, since, in order not to qualify, a person must fail on all three attempts.

For a team of 6, we must find the probability of at least 5 of them individually qualifying; that is, either 5 or 6 qualifying. The distribution of the number of people in the team who individually qualify is B(6, 1 - p^3).

$$\begin{aligned}
P(5) + P(6) &= \binom{6}{5}p^3(1 - p^3)^5 + (1 - p^3)^6 \\
&= 6p^3(1 - p^3)^5 + (1 - p^3)^6 \\
&= (1 - p^3)^5(6p^3 + 1 - p^3) \\
&= (1 - p^3)^5(1 + 5p^3).
\end{aligned}$$

Exercise 3.3

1 In a large collection of seeds, 3 out of 4 are lupins and the rest are weeds. If they 
are planted at random, find, to two decimal places, the probability that in a row of 
5 plants 

(a) all are lupins, 

(b) at least 4 are lupins. 

2 Two girls, Alice and Brenda, play a game in which Alice should win 6 games to 
every 5 won by Brenda. If they play 4 games, find, to two decimal places, the 
probability that Alice will win at least 2 games. 

3 After batteries have been stored in a certain climate, it is found that an average of 
one-fifth of them are flat. A shopkeeper buys 3 batteries. Find the probability that 
exactly 2 of them are not flat. 

A man buying batteries wishes there to be a probability of at least 0.95 that at
least 2 of them are not flat. Find whether 4 will be enough for him to buy. 

4 A box contains 12 black counters and 8 white counters. Calculate the probability 
that a random sample of 5 counters drawn together from the box will contain at 
least 4 black counters. 

5 On average rain falls on 12 days in every 30. Find the probability 

(a) that the first 4 days of a given week will be fine and the remainder wet, 

(b) that rain will fall on just 4 days of a given week. 


3.4 The geometric distribution 

A discrete random variable X with probability function

$$P(X = r) = (1 - p)^r p \quad \text{where } 0 < p \le 1 \text{ and } r = 0, 1, 2, \ldots,$$

is said to have a geometric distribution with parameter p.

This distribution can arise in an experiment which fulfils the conditions which are required to be satisfied for the binomial distribution B(n, p) except that, instead of counting the number of ‘successes’ which occur in the n trials, as we did for B(n, p), we carry on with the trials only until we get one ‘success’. Given that X is defined as the number of ‘failures’ we get before we get a ‘success’, then

$$P(X = r) = (1 - p)^r p,$$

since the probability of ‘failure’ is (1 - p). Thus, X has a geometric distribution.

This distribution can be shown to have a mean E(X) = (1 - p)/p, and variance Var(X) = (1 - p)/p^2. You are asked to obtain these results in Exercise 3.4, Nos. 1 and 5.
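Before attempting the proofs, the quoted mean and variance can be made plausible numerically by summing a large number of terms of the series. The sketch below is only a check, not a derivation; the function name `geometric_moments`, the cut-off of 2000 terms and the value p = 1/6 are our own assumptions.

```python
def geometric_moments(p, terms=2000):
    """Approximate mean and variance of P(X = r) = (1 - p)^r p, r = 0, 1, ...,
    by truncating the infinite sums after `terms` terms."""
    q = 1 - p
    probs = [q ** r * p for r in range(terms)]
    mean = sum(r * pr for r, pr in enumerate(probs))
    second_moment = sum(r * r * pr for r, pr in enumerate(probs))
    return mean, second_moment - mean ** 2


p = 1 / 6  # illustrative value: waiting for a particular face of a fair die
mean, var = geometric_moments(p)
print(mean, (1 - p) / p)        # truncated sum against (1 - p)/p
print(var, (1 - p) / p ** 2)    # truncated sum against (1 - p)/p^2
```

The neglected tail is of order (1 - p)^2000, so for this p the truncated sums agree with the formulas to within rounding error.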


Example 6 A die is to be thrown until a ‘3’ is obtained. Given that X is the 
total number of throws needed to obtain the ‘3’, write down the probability 
function of X. Find 

(a) the most probable number of throws, 
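(b) the mean number of throws,

(c) the least value n of X such that the probability that a ‘3’ has been thrown on or before the nth toss is greater than 0.5.

$$P(X = r) = \left(\tfrac{5}{6}\right)^{r-1}\left(\tfrac{1}{6}\right) \quad \text{for } r = 1, 2, \ldots.$$

(a) The greatest value of P(X = r) over the sample space occurs when r = 1 (that is, ‘no failures’), giving P(1) = 1/6.

(b) The mean number of failures is (1 - p)/p, and p = 1/6,

⇒ mean = 5 failures; that is, 6 tosses altogether.

(c) $$P(X \le n) = \tfrac{1}{6}\left[\left(\tfrac{5}{6}\right)^0 + \left(\tfrac{5}{6}\right)^1 + \left(\tfrac{5}{6}\right)^2 + \cdots + \left(\tfrac{5}{6}\right)^{n-1}\right] = \frac{1}{6}\cdot\frac{1 - \left(\tfrac{5}{6}\right)^n}{1 - \tfrac{5}{6}},$$

using the result for the sum of a geometric progression, $\dfrac{a(1 - r^n)}{1 - r}$,

$$\Rightarrow P(X \le n) = 1 - \left(\tfrac{5}{6}\right)^n.$$

We require

$$1 - \left(\tfrac{5}{6}\right)^n > 0.5 \;\Rightarrow\; \left(\tfrac{5}{6}\right)^n < 0.5 \;\Rightarrow\; 2 < \left(\tfrac{6}{5}\right)^n \;\Rightarrow\; n\lg(1.2) > \lg 2,$$

$$n > 3.802.$$

The least value of n is, therefore, 4, since n must be an integer.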
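Part (c) can also be settled by direct search, which is a useful check on the logarithm step. The following is a small Python sketch; the function name `least_throws` is our own.

```python
from math import log


def least_throws(p, target):
    """Smallest n with 1 - (1 - p)^n > target, found by trying n = 1, 2, ..."""
    n = 1
    while 1 - (1 - p) ** n <= target:
        n += 1
    return n


n = least_throws(1 / 6, 0.5)
bound = log(2) / log(6 / 5)   # the real-valued bound from n lg(1.2) > lg 2
print(n, round(bound, 3))
```

The search and the logarithmic bound agree: the bound is a little over 3.8, so the first integer that satisfies the inequality is 4.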


Exercise 3.4 

1 The probability distribution of a random variable X is geometric; that is, P(X = r) = (1 - p)^r p, for r = 0, 1, 2, ..., where 0 < p < 1. Given that

$$\sum_{r=1}^{\infty} r p^{r-1} = \frac{1}{(1 - p)^2},$$

show that E(X) = (1 - p)/p.

2 A random variable X is distributed geometrically. Given that P(X = 0) = |,
illustrate on graph paper the distribution for 0 ≤ X ≤ 4.

3 In a game, the player throws a coin until a ‘head’ is obtained and he then receives from the bank £2^n, where n is the number of throws. Find the probability that he receives (a) £16, (b) more than £16. Find the probability that, if the bank holds £10^5, the player will break the bank in a single play.

4 If the probability is 0.7 that a learner driver will pass the driving test on any given attempt, find, to two significant figures, the probability that a person will pass his test on his fourth attempt. (You may assume that the attempts are all independent.)

5 Show that the variance of the geometric distribution with probability function
P(X = R) = (1 - p)^R p for R = 0, 1, 2, ...,
is equal to (1 - p)/p^2.

[Hint: Differentiate twice, with respect to p, both sides of the equation for the sum of the infinite geometric series

$$\sum_{R=0}^{\infty} (1 - p)^R = \frac{1}{p}, \quad \text{where } 0 < p < 1.]$$


3.5 The Poisson distribution 

A discrete random variable X with probability function

$$P(X = r) = \frac{\mu^r e^{-\mu}}{r!}, \quad r = 0, 1, 2, \ldots,$$

where μ is a constant, is said to have a Poisson distribution with parameter μ. This distribution was first given by S. D. Poisson in 1837. He derived it as the limit of a binomial distribution B(n, p) when n tends to infinity and p tends to 0 in such a manner that np remains constant and equal to μ. The derivation of this limit is not within the scope of this text but we will show how this limiting distribution can be used.

If we need to calculate probabilities for B(n, p) when n is large, the arithmetic is very lengthy, although tables of binomial coefficients and of cumulative binomial probabilities can be obtained. However, in general, the Poisson distribution will provide, with less arithmetic, a good approximation to the binomial probabilities for large values of n and small values of p; for example, when n ≥ 20 and p ≤ 0.05, and when n ≥ 50 and p ≤ 0.1. For any given small value of p, the larger n is, the more accurate will be the approximation.
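The quality of the approximation can be seen by tabulating the two sets of probabilities side by side. The Python sketch below assumes the illustrative case n = 100, p = 0.01, so that np = 1; the function names are our own.

```python
from math import comb, exp, factorial


def binomial_pmf(n, p, r):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * (1 - p) ** (n - r) * p ** r


def poisson_pmf(mu, r):
    """P(X = r) for a Poisson distribution with parameter mu."""
    return mu ** r * exp(-mu) / factorial(r)


n, p = 100, 0.01   # large n, small p (illustrative values)
mu = n * p         # the Poisson parameter is taken equal to np

for r in range(5):
    print(r, round(binomial_pmf(n, p, r), 4), round(poisson_pmf(mu, r), 4))
```

Repeating the comparison with larger n and correspondingly smaller p (keeping np fixed) shows the two columns drawing closer together.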

By applying the same limiting conditions to the mean and variance of B(n, p), we can derive expressions for the mean and variance of the Poisson distribution. For the binomial, we had mean np, variance np(1 - p). We know that μ = np, and, hence, in the limit, as p → 0, we have, for the Poisson distribution,

$$\text{mean } E(X) = \mu,$$
$$\operatorname{Var}(X) = \mu(1 - p) \to \mu.$$

For the Poisson distribution, then, the mean and the variance are equal, both being equal to the parameter μ.

So far we have considered the Poisson distribution as an approximation to 
B (n,p) for small values of p and large values of n. There is another way in 
which the Poisson distribution may occur. It can be shown by mathematical 
argument that, if X is the number of events that occur in an interval of fixed 
length, then X has a Poisson distribution provided that the events occur 
(a) singly,

(b) independently of one another, 

(c) uniformly — that is, the expected number of events in a given interval is 
proportional to the size of the interval, and, 

(d) at random in continuous space or time. 

There are many examples for which the Poisson distribution is a good model: for example, the number of flaws in a length of cloth, the number of sultanas in a cake, the number of telephone calls coming into an exchange in a certain length of time, and the number of α-particles emitted per unit time from a radioactive source.


Example 7 On average, 1% of the lenses being produced by a machine are 
faulty. Estimate the probability that a random sample of 100 lenses taken 
from the production of this machine will contain 

(a) not more than 1 faulty lens, 

(b) more than 2 faulty lenses. 

Let X represent the number of faulty lenses. Then X ~ B(100, 0.01), since the probability of a faulty lens is 0.01 and the sample size is 100. We can approximate this by a Poisson distribution with

$$\mu = np = 100 \times 0.01 = 1.$$

(a) $P(X \le 1) = P(0) + P(1) = e^{-1} + 1 \cdot e^{-1} = 2e^{-1} \approx 0.736.$

(b) $P(X > 2) = P(3) + P(4) + \cdots.$

The right-hand side of this requires an infinity of probabilities to be evaluated, which is not practicable. Instead, we must use

$$\begin{aligned}
P(X > 2) &= 1 - P(0) - P(1) - P(2) \\
&= 1 - e^{-1} - e^{-1} - \frac{1^2 e^{-1}}{2!} \\
&= 1 - \frac{5e^{-1}}{2} \approx 0.080.
\end{aligned}$$


Example 8 Large rolls of velvet are being produced on a loom. The number of imperfections per 3 m of the velvet is a random variable having a Poisson distribution with μ = 2.3. Find the probability that 3 m of the velvet chosen at random from the roll will have

(a) 3 imperfections,

(b) at most 2 imperfections.

Write down the variance of the distribution.

Here X, the number of imperfections per 3 m of velvet, has a Poisson distribution with parameter 2.3.

(a) $P(X = 3) = \dfrac{(2.3)^3 e^{-2.3}}{3!} = 0.203$ to three decimal places.

(b) $P(X \le 2) = P(0) + P(1) + P(2) = e^{-2.3}\left(1 + 2.3 + (2.3)^2/2\right) = 0.596$ to three decimal places.

Variance = mean = 2.3.

Example 9 The number of bacteria in 1 ml of inoculum is known to follow a Poisson distribution with mean 3.1. If at least 3 bacteria are required for a 1 ml dose to cause infection, show that the probability of a dose causing infection is approximately equal to 0.6.

Calculate the probability that, if 6 doses are administered, at least 2 of them will cause infection.

The probability of a dose causing infection is the probability of having 3 or more bacteria.

$$\begin{aligned}
P(\ge 3) &= 1 - P(0) - P(1) - P(2) \\
&= 1 - e^{-3.1}\left(1 + 3.1 + (3.1)^2/2\right) \\
&= 1 - 0.401 \approx 0.6.
\end{aligned}$$

Six doses are administered with probability 0.6 of a single dose causing infection. If X is the number of doses causing infection, then X ~ B(6, 0.6).

$$\begin{aligned}
P(\ge 2) &= 1 - (0.4)^6 - 6(0.4)^5(0.6) = 1 - 4(0.4)^5 \\
&= 1 - 0.041 \approx 0.96.
\end{aligned}$$
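The two stages of Example 9, a Poisson tail followed by a binomial tail, can be reproduced in a few lines. This is a sketch only; the helper names `poisson_cdf` and `binomial_cdf` are our own, and the second stage uses the rounded value 0.6 exactly as the worked solution does.

```python
from math import comb, exp, factorial


def poisson_cdf(mu, k):
    """P(X <= k) for a Poisson distribution with parameter mu."""
    return sum(mu ** r * exp(-mu) / factorial(r) for r in range(k + 1))


def binomial_cdf(n, p, k):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, r) * (1 - p) ** (n - r) * p ** r for r in range(k + 1))


# Stage 1: probability that a single dose holds 3 or more bacteria.
p_infect = 1 - poisson_cdf(3.1, 2)

# Stage 2: probability that at least 2 of 6 doses cause infection,
# taking the single-dose probability as 0.6.
p_at_least_two = 1 - binomial_cdf(6, 0.6, 1)

print(round(p_infect, 3), round(p_at_least_two, 3))
```

Writing both tails as "one minus a cumulative sum" mirrors the method of the example and avoids summing the long upper tail directly.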


Exercise 3.5 

1 Telephone calls coming into a switchboard follow a Poisson distribution with 
mean 2 per minute. Find the probability that in a given minute there will be 4 or 
more calls. 

2 The road accidents in a certain area occur at an average rate of 1 every 3 days. 
Calculate, to three decimal places, the probability of 0, 1, 2, ..., 5 accidents per
week in the district. Obtain the most likely number of accidents per week. 

3 An airline finds that, on average, 3% of the persons who reserve seats for a 
certain flight do not turn up for the flight. Consequently, the airline decides to 
allow 200 people to reserve seats on a plane which can only accommodate 196 
passengers. Find the probability that there will be a seat available for every 
person who has reserved a seat and who turns up for the flight. 

4 In an examination 70% of the candidates pass but only 3% obtain Grade A. Use 
the binomial distribution to estimate the probability that a random group of 9 
candidates will contain at most 2 failures. Use the Poisson distribution to estimate 
the probability that a random group of 50 candidates will contain not more than 
one with Grade A. 

5 In the manufacture of radio sets, on the average 1 set in 25 is defective. Use the 
Poisson distribution to estimate the probability that a consignment of 100 radio 
sets will contain 
(a) no defective set,

(b) fewer than 4 defective sets. 

A random sample of 25 of the radio sets is found to contain 3 defectives. Is it 
likely that on average only 1 radio set in 25 is defective? 

6 Two hundred misprints are distributed randomly throughout a book of 600 pages. 
Find the probability that a given page will contain 

(a) exactly 2 misprints, 

(b) 2 or more misprints. 

7 In a trial, 7 coins are tossed together. Given that 100 trials are made, find the 
number of times one should expect to obtain 5 ‘heads’ and 2 ‘tails’. 

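8 For the Poisson distribution with probability function

$$P(X = r) = \frac{\mu^r e^{-\mu}}{r!}, \quad r = 0, 1, 2, \ldots,$$

where μ is a constant, show that E(X) = μ.
Show also that Var(X) = μ.

$$\left[\text{Hint: Write } \sum_{r} \frac{r^2\mu^r e^{-\mu}}{r!} = \sum_{r} \frac{r(r - 1)\mu^r e^{-\mu}}{r!} + \sum_{r} \frac{r\mu^r e^{-\mu}}{r!}.\right]$$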


Miscellaneous Exercise 3 

1 In a game a player cannot start until he has thrown a ‘6’ on a die. Calculate the 
probability that he has to throw the die more than 3 times before he can start. 

2 A die is thrown repeatedly until a ‘6’ is thrown. Given that an odd number of 
throws is required, calculate, to three significant figures, the probability that 3 
throws are required. 

3 A large batch of earthenware mugs is moulded and fired. After firing, a random 
sample of 10 mugs is inspected for flaws before glazing, decoration and final firing. 
Given that 25% of the mugs in the batch have flaws, use the appropriate binomial 
distribution to calculate, to two significant figures, the probability that the 
random sample contains 

(a) no mugs with flaws, 

(b) exactly 1 mug with a flaw. 

The batch is accepted without further checking if the random sample contains no 
more than 2 mugs with flaws. Find the probability that the batch will be accepted 
without further checking. 

4 A gun is firing at a target and it must make at least 2 direct hits. The probability of 
a direct hit with a single round is |, and this probability remains constant through- 
out the firing. Four rounds are fired, and, if at least 2 direct hits are scored, firing 
ceases. Otherwise, 4 more rounds are fired. Find the probability that at least 2 
direct hits are scored 

(a) only 4 rounds being fired, 

(b) 8 rounds having to be fired. 

5 An examination consists of 9 questions in each of which the candidate must say 
which one of 5 answers is the correct one. For each question a certain candidate 
guesses any one of the 5 answers with equal probability. 

(a) Prove that the probability that he obtains more than 1 correct answer is equal to

$$\frac{5^9 - 13 \times 4^8}{5^9}.$$

(b) Find the probability that he obtains correct answers to 7 or more questions.

6 State clearly the conditions under which a binomial distribution applies. 

A man canvasses people to join a book club. For each new member he receives 
£1. The probability that a person he canvasses will join is 0.15. Calculate, to three
decimal places, the probability that he will obtain 3 or more new members from 
10 people canvassed. State the amount of money that he would be expected to 
obtain on average from 20 canvassings. State how many people he would need to 
canvass each evening to average 3 new members per evening. Calculate the 
minimum number of people he must arrange to canvass to be 99% certain of 
obtaining at least one new member. 

7 Write down an expression for P(X = r), where X is the number of successes in 20 independent trials each having a probability of 0.05 of success. Write down the values of E(X) and Var(X).

A large batch of cigarette lighters is accepted if either 

(a) a random sample of 20 lighters contains no defective lighter, or 

(b) a random sample of 20 lighters contains one defective lighter only, and a 
second sample of 20 is then drawn and found to contain no defective lighter. 
Otherwise, the batch is rejected. If, in fact, 5% of the lighters in the batch are 
defective, find the probability of the batch being accepted. Find the expected 
number of lighters that will have to be sampled to reach a decision on a batch. 

8 In the mass production of plug sockets it is found that, on average, one in 30 plug 
sockets is defective. Assuming that the number of defectives in a random sample 
of plug sockets follows a Poisson distribution, show that there is a probability of 
approximately 0.05 that a random sample of 60 plug sockets will contain more than 4 defectives and a probability of less than 0.01 that it will contain more than 6 defectives.

9 Random samples, each of volume 1 cm³, are taken from a well-mixed cell suspension in which, on average, 1 cell of type A is present per cm³. Find the probability that a sample will contain fewer than 3 cells of type A. Given that 10 such samples are taken, find the probability that exactly 4 of these will contain fewer than 3 type A cells.

10 At a factory, 15% of the cassette players made are defective. Find the probability 
that a sample of 5 cassette players will contain 

(a) no defective, 

(b) exactly 1 defective, 

(c) at least 2 defectives. 

11 In a laboratory experiment, an untrained mouse can pass through 3 doors, behind 
only 1 of which is food. It goes through the doors until it finds the food. Assuming 
that the mouse has no memory and that it chooses the doors at random, write
down the probability distribution for the number of doors it tries in order to get 
the food. 

A mouse which has been trained in the experiment is thought to have devel- 
oped a memory so that it will not try any door more than once. Assuming this is 
true, write down the corresponding probability distribution for the trained mouse. 
Find the probability that 

(a) the trained mouse will obtain food with fewer doors tried than the untrained 
mouse, 

(b) the untrained mouse will obtain food with fewer doors tried than the trained 
mouse. 

12 In a large batch of dried fruit, it is known that the average proportion of sultanas 
is 0.55 and the average proportion of pieces of candied peel is 0.005. Calculate to three decimal places

(a) the probability of there being 2 or more sultanas in a random sample of 10 of 
the dried fruits, 

(b) the probability of there being 3 or fewer pieces of candied peel in a random 
sample of 400 of the dried fruits. 

13 The number of bacteria in a dose of 1 ml is known to follow a Poisson distribution with mean 3.5. If at least 3 bacteria are needed for a dose to cause infection, show that the probability of a dose of 1 ml causing infection is approximately 0.679. Find the probability of infection being caused if the dose is trebled.

14 A single die is thrown 6 times. Find the probability that 

(a) at least one ‘6’ is thrown, 

(b) fewer than two ‘6’s are thrown, 

(c) each face of the die turns up once. 

Find also the probability that, when a die is thrown repeatedly, a ‘6’ appears for the first time at the kth throw.

Find the least value of k such that there is more than a 60% chance of obtaining a ‘6’ on the kth throw or earlier, and find the values of E(k) and Var(k).

15 Failures of the steering mechanism of a car occur at random with, on average, 1 
failure in 300000 miles. Use the Poisson distribution to find, to four decimal 
places, the probability that 

(a) the car completes 45 000 miles without a steering failure, 

(b) there are 3 or more failures in 45 000 miles. 

Two cars, X and Y, with this type of steering mechanism, are bought, and, 
while X is running its first 45 000 miles, Y will run 150000 miles. Find the 
probability that during this period there will be not more than one steering failure 
altogether. 

16 An examination paper consists of 20 questions, to each of which the candidate has 
to answer ‘True’ or ‘False’. A candidate answers the paper entirely by guessing — 
that is, he chooses the answers ‘True’ and ‘False’ with equal probability. Each 
question carries equal marks. Use a table of cumulative binomial probabilities to 
show that, if the pass mark is 75%, the probability that he passes is approximately 0.021.
4 Some continuous probability distributions


We now discuss three continuous probability distributions that occur fre- 
quently in statistical work: the uniform, the exponential and the normal 
distributions. 

4.1 The continuous uniform distribution 

A continuous random variable X whose probability density function (pdf) is given by

$$f(x) = \frac{1}{b - a} \quad \text{for } a \le x \le b, \text{ where } a, b \text{ are finite},$$
$$f(x) = 0 \quad \text{otherwise},$$

has a continuous uniform distribution over the interval [a, b]. We can see that this is a valid density function, since f(x) ≥ 0, and

$$\int_{-\infty}^{+\infty} f(x)\,dx = \int_a^b \frac{1}{b - a}\,dx = 1.$$

The mean of the distribution, E(X), is given by

$$E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx = \int_a^b \frac{x}{b - a}\,dx = \frac{a + b}{2},$$

an obvious result when we consider the sketch of f(x) as shown in Fig. 4.1.
The variance of the distribution is given by

$$\begin{aligned}
\operatorname{Var}(X) &= E(X^2) - [E(X)]^2 \\
&= \int_a^b \frac{x^2}{b - a}\,dx - \frac{(a + b)^2}{4} \\
&= \frac{b^3 - a^3}{3(b - a)} - \frac{(a + b)^2}{4} \\
&= \frac{b^2 + ab + a^2}{3} - \frac{a^2 + 2ab + b^2}{4} \\
&= \frac{b^2 - 2ab + a^2}{12} = \frac{(b - a)^2}{12}.
\end{aligned}$$

The distribution function is given by

$$F(x) = 0 \quad \text{for } x < a,$$
$$F(x) = \int_a^x \frac{1}{b - a}\,dt = \frac{x - a}{b - a} \quad \text{for } a \le x \le b,$$
$$F(x) = 1 \quad \text{for } x > b.$$

This distribution has the property that the probability of X lying in any range within [a, b] is the same as the probability of X lying in any other range of the same length in [a, b]. It can be used as a model for the rounding errors made when measurements are taken. For example, if we are measuring to the nearest millimetre and we may assume that the rounding error is equally likely to take any value between -0.5 mm and +0.5 mm, then we may take the uniform distribution over [-0.5, +0.5] as a reasonable model for the distribution of the rounding errors. Alternatively, one might assume that certain fibres, which are being tested for strength, break along their lengths at a distance X from one end, where X has a uniform distribution, and then this model could be tested against experimental results.
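The distribution function makes the equal-length property easy to verify for the rounding-error model. The following Python sketch uses our own function names, and the particular ranges tested are chosen only for illustration.

```python
def uniform_cdf(x, a, b):
    """F(x) for the continuous uniform distribution on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


def interval_probability(lo, hi, a, b):
    """P(lo < X < hi) = F(hi) - F(lo)."""
    return uniform_cdf(hi, a, b) - uniform_cdf(lo, a, b)


def uniform_mean_variance(a, b):
    """Mean (a + b)/2 and variance (b - a)^2 / 12."""
    return (a + b) / 2, (b - a) ** 2 / 12


a, b = -0.5, 0.5  # rounding error in mm when measuring to the nearest millimetre

# Two ranges of the same length 0.2 carry the same probability.
print(interval_probability(-0.4, -0.2, a, b), interval_probability(0.1, 0.3, a, b))
print(uniform_mean_variance(a, b))
```

Under this model the rounding error has mean 0 and variance 1/12 mm², and the probability attached to a range depends only on its length.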


Example 1 The hardness on the Brinell scale of certain magnesium alloys is 
assumed to follow a continuous uniform distribution over the interval 
[27,80]. Write down the pdf, and find the mean and the variance of the 
distribution. 


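$$f(x) = \frac{1}{80 - 27} = \frac{1}{53} \quad \text{for } 27 \le x \le 80,$$
$$f(x) = 0 \quad \text{otherwise}.$$

$$E(X) = \frac{27 + 80}{2} = 53.5.$$

$$\operatorname{Var}(X) = \frac{(80 - 27)^2}{12} = \frac{2809}{12} \approx 234.1.$$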


Example 2 A point is chosen at random on the line segment [1,3]. Find the 
probability that the point will lie between 1-5 and 2. 
Let X represent the point; then X is equally likely to take any value in [1, 3], since it is chosen at random. Hence, X is uniformly distributed on [1, 3] with pdf

$$f(x) = \frac{1}{3 - 1} = \frac{1}{2} \quad \text{for } 1 \le x \le 3,$$
$$f(x) = 0 \quad \text{otherwise}.$$

$$P(1.5 < X < 2) = \int_{1.5}^{2} \frac{1}{2}\,dx = \frac{1}{4}.$$


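Example 3 A point is chosen at random on a line of length a. Find the probability that the ratio of the length of the shorter segment to the length of the longer segment is less than 1/3.

Let X represent the distance from the point to one end of the line, so that the length of the other part is (a - X). Then X is uniformly distributed with pdf

$$f(x) = \frac{1}{a} \quad \text{for } 0 \le x \le a,$$
$$f(x) = 0 \quad \text{otherwise}.$$

Then either

$$\frac{X}{a - X} < \frac{1}{3} \;\Rightarrow\; X < \frac{a}{4}, \qquad P\left(X < \frac{a}{4}\right) = \frac{1}{a}\cdot\frac{a}{4} = \frac{1}{4},$$

or

$$\frac{a - X}{X} < \frac{1}{3} \;\Rightarrow\; X > \frac{3a}{4}, \qquad P\left(X > \frac{3a}{4}\right) = 1 - F\left(\frac{3a}{4}\right) = 1 - \frac{3}{4} = \frac{1}{4},$$

$$\Rightarrow \text{probability} = \frac{1}{4} + \frac{1}{4} = \frac{1}{2}.$$

This is an obvious result if we look at Fig. 4.2. We have found that the point may lie on AC or on DB, but not on CD; that is, it may lie on half of the line AB. Hence, the probability is $\dfrac{\tfrac{1}{2}AB}{AB} = \dfrac{1}{2}$.

[Fig. 4.2: line AB with points A, C, D, B in order; AC = a/4, CD = a/2, DB = a/4]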
Exercise 4.1

1 X is a continuous random variable with pdf f(x), where

f(x) = 1/8 for 0 ≤ x ≤ 8,
f(x) = 0 elsewhere.

Calculate P(2 ≤ X ≤ 6). Find F(x), the distribution function, and illustrate F(x) by a sketch.

2 Given that X is a continuous random variable uniformly distributed with pdf f(x), where

f(x) = k for 2 ≤ x ≤ 4,
f(x) = 0 elsewhere,

find the values of k and of E(X) and Var(X).

Determine the distribution function F(x), and illustrate F(x) by a sketch.

3 A continuous random variable X has a probability density function f(x) given by

f(x) = k for 0 ≤ x ≤ 4,
f(x) = 0 elsewhere.

Find the value of k, and determine the distribution function F(x). Evaluate
P(X ^ 3). 

4 The random variable X has a continuous uniform distribution over the interval [a, b]. Prove that the probability that X will take values less than a + m(b - a) is equal to m, given that 0 ≤ m ≤ 1.


4.2 The exponential distribution 

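A continuous random variable X whose pdf f(x) is given by

$$f(x) = \lambda e^{-\lambda x} \quad \text{for } x \ge 0, \text{ where } \lambda \text{ is a positive constant},$$
$$f(x) = 0 \quad \text{otherwise},$$

has an exponential distribution with parameter λ. Since

$$\int_0^\infty \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_0^\infty = 1$$

and f(x) ≥ 0, this is a valid density function.

$$\begin{aligned}
E(X) &= \int_0^\infty \lambda x e^{-\lambda x}\,dx \\
&= \left[-x e^{-\lambda x}\right]_0^\infty + \int_0^\infty e^{-\lambda x}\,dx, \text{ integrating by parts,} \\
&= \left[-\frac{e^{-\lambda x}}{\lambda}\right]_0^\infty = \frac{1}{\lambda}, \text{ for } \lambda > 0.
\end{aligned}$$

$$\begin{aligned}
E(X^2) &= \int_0^\infty \lambda x^2 e^{-\lambda x}\,dx \\
&= \left[-x^2 e^{-\lambda x}\right]_0^\infty + \int_0^\infty 2x e^{-\lambda x}\,dx, \text{ integrating by parts,} \\
&= \frac{2}{\lambda}E(X) = \frac{2}{\lambda^2}.
\end{aligned}$$

Hence

$$\operatorname{Var}(X) = E(X^2) - [E(X)]^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}.$$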

The distribution function, F(x), is given by
F(x) = 0 for x < 0,

$$F(x) = \int_0^x \lambda e^{-\lambda t}\,dt = \Big[-e^{-\lambda t}\Big]_0^x = 1 - e^{-\lambda x}, \quad \text{for } x \ge 0.$$
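
The mean $1/\lambda$ and variance $1/\lambda^2$ of the exponential distribution can be checked numerically. The sketch below is illustrative only: the helper names, the value $\lambda = 0.5$, the trapezium rule and the truncation of the infinite range at x = 80 are all assumptions made for the check, not part of the text.

```python
import math


def exp_pdf(x, lam):
    """pdf of the exponential distribution with parameter lam."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def exp_cdf(x, lam):
    """Distribution function F(x) = 1 - e^{-lam x} for x >= 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


def integrate(g, lo, hi, n=100_000):
    """Composite trapezium rule for the integral of g from lo to hi."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, n):
        total += g(lo + i * h)
    return total * h


lam = 0.5      # hypothetical parameter value
upper = 80.0   # truncation point; the neglected tail is of order e^{-40}

mean = integrate(lambda x: x * exp_pdf(x, lam), 0.0, upper)
second_moment = integrate(lambda x: x * x * exp_pdf(x, lam), 0.0, upper)
variance = second_moment - mean ** 2

print(mean, 1 / lam)            # numerical mean against 1/lambda
print(variance, 1 / lam ** 2)   # numerical variance against 1/lambda^2
```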


Many real-life situations can be modelled by an exponential distribution. 
For example, suppose that a machine is testing a particular electrical com- 
ponent. Let T represent the length of time that a component in continuous 
use lasts before it fails. If T has a constant failure rate — that is, after a 
component is in use its probability of failing does not change (there is no 
wearing effect) — then T can be modelled by an exponential distribution. 

The waiting time between events in a Poisson process will have an
exponential distribution, with parameter $\lambda$ having the same value as the
respective Poisson parameter $\mu$, provided that the units of time are the same
in both distributions. For example, the time interval between vehicles passing
a certain point when traffic is flowing freely, each vehicle travelling
independently of (that is, not restricted by) any other vehicle, can be considered as an
exponential variable.
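
The link between the two distributions can be illustrated numerically: the waiting time exceeds t exactly when no event occurs in an interval of length t. The sketch below assumes a hypothetical rate of 4 events per hour and an interval of a quarter of an hour; the function names are illustrative.

```python
import math


def poisson_pmf(k, mean):
    """P(N = k) for a Poisson variable N with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def exp_cdf(t, lam):
    """P(T <= t) for an exponential waiting time T with parameter lam."""
    return 1.0 - math.exp(-lam * t)


rate = 4.0   # hypothetical: events per hour
t = 0.25     # interval length in hours (same time unit as the rate)

p_wait_exceeds_t = 1.0 - exp_cdf(t, rate)     # P(T > t)
p_no_event_in_t = poisson_pmf(0, rate * t)    # P(no event in time t)

print(p_wait_exceeds_t, p_no_event_in_t)      # the two values agree
```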

An interesting property of the exponential distribution is that it has ‘no
memory’. That is, suppose that event E has not occurred during time T; then
the probability that it does not occur during the following interval of time t is
the same as the probability that it would not have occurred during time t from
the start. Expressed in symbolic form, this is

$$P(X > T + t \mid X > T) = P(X > t).$$

In other words, the property of ‘no occurrence’ during the first T units of time
is ‘forgotten’ so far as the subsequent time t is concerned.

Proof

$$P(X > T + t \mid X > T) = \frac{P(X > T + t)}{P(X > T)} = \frac{1 - F(T + t)}{1 - F(T)}.$$

We have shown that $F(x) = 1 - e^{-\lambda x}$ and, hence,

$$P(X > T + t \mid X > T) = \frac{e^{-\lambda(T + t)}}{e^{-\lambda T}} = e^{-\lambda t} = P(X > t).$$
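
A short numerical check of the ‘no memory’ property is sketched below. The parameter value 0.2, the interval t = 3 and the list of elapsed times T are hypothetical; the conditional probability is computed directly from its definition and compared with the unconditional one.

```python
import math


def survival(x, lam):
    """P(X > x) = 1 - F(x) = e^{-lam x} for x >= 0."""
    return math.exp(-lam * x)


def conditional_survival(T, t, lam):
    """P(X > T + t | X > T), from the definition of conditional probability."""
    return survival(T + t, lam) / survival(T, lam)


lam, t = 0.2, 3.0   # hypothetical values
for T in (0.0, 1.0, 5.0, 20.0):
    # Each conditional probability equals P(X > t), whatever the value of T.
    print(T, conditional_survival(T, t, lam), survival(t, lam))
```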

This same property of ‘no memory’ can also be shown to apply to the
geometric distribution discussed in Chapter 3.

Example 4 The mileage, in thousands of miles, before tyre replacement is
necessary in lorries using a certain cross-ply tyre is a random variable which
has an exponential distribution with parameter $\frac{1}{60}$. State the mean mileage,
and find the probability that a tyre of this type will last for

(a) at most 20000 miles,

(b) more than 40000 miles.

$E(X) = \frac{1}{\lambda} = 60$ ⇒ mean mileage is 60000 miles.

The exponential distribution function gives

$P(X \le M) = 1 - e^{-M/60}$, where M is the mileage in thousands of miles.

(a) $P(X \le 20) = 1 - e^{-20/60} = 1 - e^{-1/3} \approx 0.283.$

(b) $P(X > 40) = 1 - P(X \le 40) = e^{-40/60} = e^{-2/3} \approx 0.513.$


Example 5 Given that the length of the life of a light-bulb of a certain type is
an exponentially distributed random variable with expected lifetime of 15
weeks when continuously lit, find the probability that, if 3 such bulbs are lit at
the same time and left alight, at least 2 of them will still be alight after 20
weeks.

X, the length of life in weeks, is distributed exponentially with parameter $\lambda$,
given by $\frac{1}{\lambda} = 15$, $\lambda = \frac{1}{15}$.

$f(x) = \frac{1}{15} e^{-x/15}$ for $x \ge 0$.

$F(x) = 1 - e^{-x/15}$.

$P(X < 20) = F(20) = 1 - e^{-20/15} = 1 - e^{-4/3}$
⇒ $P(X \ge 20) = 1 - P(X < 20) = e^{-4/3}$, for each bulb.

Hence, the probability that at least 2 of the 3 bulbs are still alight is given by
the sum of the probabilities that only 2 are alight and that all 3 are alight.

$= 3e^{-4/3} \cdot e^{-4/3}(1 - e^{-4/3}) + e^{-4/3} \cdot e^{-4/3} \cdot e^{-4/3}$
$= e^{-8/3}(3 - 2e^{-4/3}) \approx 0.172.$
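
The light-bulb calculation combines an exponential survival probability with a binomial count of survivors. A minimal Python sketch is given below; the function name `prob_at_least` is an illustrative choice, and the bulbs are assumed to fail independently, as in the worked solution.

```python
import math


def prob_at_least(k, n, p):
    """P(at least k successes in n independent trials, each with probability p)."""
    return sum(math.comb(n, r) * p ** r * (1 - p) ** (n - r)
               for r in range(k, n + 1))


p_alight = math.exp(-20 / 15)          # P(one bulb lasts at least 20 weeks)
answer = prob_at_least(2, 3, p_alight)

# Closed form obtained in the worked solution, for comparison.
closed_form = math.exp(-8 / 3) * (3 - 2 * math.exp(-4 / 3))

print(round(answer, 3), round(closed_form, 3))
```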


Example 6 The time, T seconds, between the arrival of successive vehicles
at a point on a country road has pdf given by $f(t) = ke^{-\lambda t}$ when $t \ge 0$, and
where $\lambda = \frac{1}{100}$. State the mean and variance of T.

A pedestrian takes 50 seconds to cross the road at this point. Find the
probability that if he starts to cross as a vehicle passes by, he will reach the
other side before another vehicle gets to that point.

He crosses back with the same procedure. Find the probability that he
completes each crossing without another vehicle arriving at his crossing point
while he is crossing.

T is distributed exponentially with parameter $\lambda = \frac{1}{100}$.

mean time $= \frac{1}{\lambda} = 100$ s.

variance $= \frac{1}{\lambda^2} = 10000$ s².

$F(t) = 1 - e^{-t/100}$.

$P(T > 50) = 1 - P(T \le 50) = e^{-50/100} = e^{-1/2} \approx 0.607.$

On the first crossing $P(T > 50) \approx 0.607$.

On the second crossing $P(T > 50) \approx 0.607$.

$P(T > 50 \text{ on each crossing}) \approx 0.607 \times 0.607$ (by independence)

$\approx 0.368.$
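
The two crossing probabilities can be reproduced directly from the exponential survival function $P(T > t) = e^{-\lambda t}$. The sketch below uses an illustrative helper name, `exp_survival`, and assumes, as in the worked solution, that the two crossings are independent.

```python
import math


def exp_survival(t, lam):
    """P(T > t) = e^{-lam t} for an exponential variable with parameter lam."""
    return math.exp(-lam * t)


lam = 1 / 100                          # parameter, per second
one_crossing = exp_survival(50, lam)   # no vehicle during one 50 s crossing
both_crossings = one_crossing ** 2     # independence of the two crossings

print(round(one_crossing, 3), round(both_crossings, 3))   # 0.607 0.368
```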


Exercise 4.2 

1 A police radar trap catches X motorists per hour who are speeding on a certain
stretch of motorway. Assuming that X is a Poisson random variable with
parameter 6.3, find the probability that the time between successive speeders is
less than 15 minutes.

2 Accidents occur at random on a certain road at the rate of 2 per day. Find, to 
three decimal places, the probability that, after one particular accident has 
occurred, there will be at least 12 hours before the next accident. 

3 Vehicles pass a particular point on a main road at the rate of 120 per hour. Write 
down the pdf of the time in minutes between 2 consecutive vehicles. A man takes 
30 seconds to cross the road. Find the probability that he will be able to complete 
his crossing between 2 consecutive vehicles. 

4 A certain make of electric iron needs repairing on average once every 3 years. 
Assuming that the times between repairs are exponentially distributed, find the 
probability that an iron of this make will go on working for at least 4 years without 
needing a repair. 

5 A target contains three concentric circular rings X, Y, and Z of radii 1, 2, and 3
units, respectively. A shot scores 5 points if it falls inside X, 3 between X and Y, 1
between Y and Z, and 0 outside Z. If the probability density function of r, the
distance of a shot from the centre, is $e^{-r}$, find the probability of each possible
score, and show that the expected value of the score for a single shot is
approximately 3.94.


4.3 The normal distribution 

The continuous random variable X whose pdf f(x) is given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2/(2\sigma^2)} \quad \text{for all real } x,$$

where $\mu$ and $\sigma$ are real constants and $\sigma > 0$, has a normal (or Gaussian)
distribution. We can write this $X \sim N(\mu, \sigma^2)$, to be read as X is normally
distributed with parameters $\mu$ and $\sigma^2$. This distribution is probably the most
important of all the theoretical probability distributions, and it serves as a
model for data from many fields of study. It was first investigated in the
eighteenth century by mathematicians who were studying the distributions of
errors when quantities were measured, although its mathematical properties
were not fully developed until the work of Gauss in the early nineteenth
century. Soon after this, Quetelet, the astronomer, was the first to use a
normal distribution in the study of biological quantities, and Galton used it in
his work on heredity later in the nineteenth century.

From the pdf we can state some important properties of the distribution.

(i) The curve representing f(x) is symmetrical about the line $x = \mu$.

(ii) x can take all values between $-\infty$ and $+\infty$.

(iii) The x-axis is an asymptote to the curve at both ends.

(iv) The maximum value on the curve is at $x = \mu$, where the ordinate is
$\frac{1}{\sigma\sqrt{2\pi}}$.

A sketch of the distribution is shown in Fig. 4.3.



In order to find the mean and variance of this distribution, we use the
values of three definite integrals, two of which involve calculus beyond the
scope of this text.

(i) $\int_{-\infty}^{+\infty} e^{-t^2/2}\,dt = \sqrt{2\pi}$,

(ii) $\int_{-\infty}^{+\infty} t\,e^{-t^2/2}\,dt = 0$,
as is otherwise obvious because the integrand is an odd function,

(iii) $\int_{-\infty}^{+\infty} t^2 e^{-t^2/2}\,dt = \sqrt{2\pi}$.

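Using these results,

$$E(X) = \int_{-\infty}^{+\infty} \frac{x}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2/(2\sigma^2)}\,dx$$

$$= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} (z\sigma + \mu)\, e^{-z^2/2}\,dz, \quad \text{where } \frac{x - \mu}{\sigma} = z,$$

$$= \frac{\sigma}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} z\, e^{-z^2/2}\,dz + \frac{\mu}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-z^2/2}\,dz$$

$$= 0 + \frac{\mu\sqrt{2\pi}}{\sqrt{2\pi}} = \mu.$$

Thus, the mean is the parameter $\mu$, as is clear from the symmetry of the curve
in Fig. 4.3.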


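$$E(X^2) = \int_{-\infty}^{+\infty} \frac{x^2}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2/(2\sigma^2)}\,dx$$

$$= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} (z\sigma + \mu)^2 e^{-z^2/2}\,dz, \quad \text{where } z = \frac{x - \mu}{\sigma},$$

$$= \frac{1}{\sqrt{2\pi}} \left[ \sigma^2 \int_{-\infty}^{+\infty} z^2 e^{-z^2/2}\,dz + 2\mu\sigma \int_{-\infty}^{+\infty} z\, e^{-z^2/2}\,dz + \mu^2 \int_{-\infty}^{+\infty} e^{-z^2/2}\,dz \right]$$

$$= \frac{1}{\sqrt{2\pi}} \left[ \sigma^2\sqrt{2\pi} + 0 + \mu^2\sqrt{2\pi} \right]$$

$$= \sigma^2 + \mu^2.$$

$$\Rightarrow \mathrm{Var}(X) = (\sigma^2 + \mu^2) - \mu^2 = \sigma^2.$$

Hence, the variance is the parameter $\sigma^2$.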
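
As a numerical cross-check on the roles of the two parameters, the sketch below integrates the normal pdf by the trapezium rule. The parameter values 3 and 2, the helper names and the truncation of the range to twelve standard deviations either side of the mean are assumptions made for the illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """pdf of the normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def integrate(g, lo, hi, n=20_000):
    """Composite trapezium rule for the integral of g from lo to hi."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, n):
        total += g(lo + i * h)
    return total * h


mu, sigma = 3.0, 2.0                        # hypothetical parameter values
lo, hi = mu - 12 * sigma, mu + 12 * sigma   # the tails beyond this are negligible

area = integrate(lambda x: normal_pdf(x, mu, sigma), lo, hi)
mean = integrate(lambda x: x * normal_pdf(x, mu, sigma), lo, hi)
second_moment = integrate(lambda x: x * x * normal_pdf(x, mu, sigma), lo, hi)
variance = second_moment - mean ** 2

print(area, mean, variance)   # close to 1, mu and sigma squared
```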

To calculate probabilities associated with $N(\mu, \sigma^2)$, we would need to
evaluate the integral of f(x) between the two limits in which we are interested.
This is obviously a very difficult task, and so we simplify the problem by
standardising the variable in order to transform $N(\mu, \sigma^2)$ to the standard
normal distribution N(0,1). This latter distribution has a mean of zero and a
variance of 1; that is, we have transformed the mean to the origin, and, in
finding areas under the curve for the probabilities, we can use the symmetric
property of the distribution curve.

In Chapter 2, we had the results

$$E(aX + b) = aE(X) + b, \qquad \mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X).$$

If we make the transformation

$$Z = \frac{X - \mu}{\sigma},$$

then

$$E(Z) = \frac{1}{\sigma}E(X) - \frac{\mu}{\sigma} = \frac{\mu - \mu}{\sigma} = 0,$$

$$\mathrm{Var}(Z) = \frac{1}{\sigma^2}\mathrm{Var}(X) = \frac{\sigma^2}{\sigma^2} = 1.$$

This transformation transforms the mean $\mu$ for X to the mean 0 for Z,
and variance $\sigma^2$ for X to variance 1 for Z, transforming $X \sim N(\mu, \sigma^2)$ into
$Z \sim N(0,1)$. Z is called the standard normal variate.

The pdf of N(0,1) is usually written as $\phi(z)$ and the distribution function as
$\Phi(z)$.

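We have

$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$

and

$$P(Z < z) = \Phi(z) = \int_{-\infty}^{z} \phi(t)\,dt = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\,dt.$$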


This integral cannot be evaluated by elementary calculus, but it can be
evaluated numerically, and these numerical evaluations have been tabulated for
varying values of z. A sketch of N(0,1) is shown in Fig. 4.4, the shaded region
representing $\Phi(z_1)$. Since the total area under the curve must be equal to 1
(certainty), the area on each side of the line z = 0 must equal 0.5 by
symmetry. The tables of values connecting z and $\Phi(z)$ are given, principally, in
three different forms.

Some tables show corresponding values of z and $\Phi(z)$; that is, they give
values of the shaded area in Fig. 4.4 and the corresponding z-values. Of this
type of table, some show both negative and positive values of z. Others show
only positive values of z and expect us to use the symmetry of the curve and
the fact that the total area under the curve is 1, to deduce the corresponding
value of $\Phi(z)$ for z < 0. For example, to find the shaded area in Fig. 4.5, we
need to find $\Phi(-z_1)$. By symmetry, this area is equal to the area from $z = z_1$
to $z = +\infty$. Hence,

$$\Phi(-z_1) = 1 - \Phi(z_1).$$

We can use this equation and the value of $\Phi(z_1)$ for the corresponding
positive value of $z_1$ in order to find $\Phi(-z_1)$.

A second form of table gives the area under the curve for the interval
$(z, +\infty)$, and this means, as we can see from Fig. 4.4, that the table is showing
us corresponding values of z and $1 - \Phi(z)$.

A third form of table gives the area under the curve from z = 0 to $z = z_1$,
the shaded region of Fig. 4.6, and, hence, shows corresponding values of $z_1$
and $[\Phi(z_1) - 0.5]$. Any other associated area can then be obtained by
symmetry.

Fig. 4.6

Although this availability of different types of table may seem confusing,
this is not really so, as all tables are clearly marked in words, or a diagram
shows the area values which are given by the table corresponding to each z. In
problem solving, we soon become used to using one particular type of table.
For the solved problems in this text, a table showing $\Phi(z)$ for corresponding
positive z values has been used (Table 4.1).
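
The three forms of table are related by simple arithmetic, which the following sketch makes explicit. It relies on the identity $\Phi(z) = \frac{1}{2}[1 + \mathrm{erf}(z/\sqrt{2})]$ and on Python's `math.erf`; the function names and the value z = 0.5 are illustrative choices.

```python
import math


def Phi(z):
    """Standard normal distribution function, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def upper_tail(z):
    """Second form of table: area under the curve from z to +infinity."""
    return 1.0 - Phi(z)


def area_from_zero(z):
    """Third form of table: area under the curve from 0 to z."""
    return Phi(z) - 0.5


z1 = 0.5
print(round(Phi(z1), 4))              # first form
print(round(upper_tail(z1), 4))       # second form
print(round(area_from_zero(z1), 4))   # third form

# Symmetry: Phi(-z1) = 1 - Phi(z1).
print(round(Phi(-z1), 4), round(1.0 - Phi(z1), 4))
```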


Table 4.1 The distribution function, $\Phi(z)$,
of the normal probability function, $\phi(z)$

$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \qquad \Phi(z) = \int_{-\infty}^{z} \phi(t)\,dt.$$

 z     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
0.0   .5000   .5040   .5080   .5120   .5160   .5199   .5239   .5279   .5319   .5359
0.1   .5398   .5438   .5478   .5517   .5557   .5596   .5636   .5675   .5714   .5753
0.2   .5793   .5832   .5871   .5910   .5948   .5987   .6026   .6064   .6103   .6141
0.3   .6179   .6217   .6255   .6293   .6331   .6368   .6406   .6443   .6480   .6517
0.4   .6554   .6591   .6628   .6664   .6700   .6736   .6772   .6808   .6844   .6879
0.5   .6915   .6950   .6985   .7019   .7054   .7088   .7123   .7157   .7190   .7224
0.6   .7257   .7291   .7324   .7357   .7389   .7422   .7454   .7486   .7517   .7549
0.7   .7580   .7611   .7642   .7673   .7704   .7734   .7764   .7794   .7823   .7852
0.8   .7881   .7910   .7939   .7967   .7995   .8023   .8051   .8078   .8106   .8133
0.9   .8159   .8186   .8212   .8238   .8264   .8289   .8315   .8340   .8365   .8389
1.0   .8413   .8438   .8461   .8485   .8508   .8531   .8554   .8577   .8599   .8621
1.1   .8643   .8665   .8686   .8708   .8729   .8749   .8770   .8790   .8810   .8830
1.2   .8849   .8869   .8888   .8907   .8925   .8944   .8962   .8980   .8997   .9015
1.3   .9032   .9049   .9066   .9082   .9099   .9115   .9131   .9147   .9162   .9177
1.4   .9192   .9207   .9222   .9236   .9251   .9265   .9279   .9292   .9306   .9319
1.5   .9332   .9345   .9357   .9370   .9382   .9394   .9406   .9418   .9429   .9441
1.6   .9452   .9463   .9474   .9484   .9495   .9505   .9515   .9525   .9535   .9545
1.7   .9554   .9564   .9573   .9582   .9591   .9599   .9608   .9616   .9625   .9633
1.8   .9641   .9649   .9656   .9664   .9671   .9678   .9686   .9693   .9699   .9706
1.9   .9713   .9719   .9726   .9732   .9738   .9744   .9750   .9756   .9761   .9767
2.0   .97725  .97778  .97831  .97882  .97932  .97982  .98030  .98077  .98124  .98169
2.1   .98214  .98257  .98300  .98341  .98382  .98422  .98461  .98500  .98537  .98574
2.2   .98610  .98645  .98679  .98713  .98745  .98778  .98809  .98840  .98870  .98899
2.3   .98928  .98956  .98983  .99010  .99036  .99061  .99086  .99111  .99134  .99158
2.4   .99180  .99202  .99224  .99245  .99266  .99286  .99305  .99324  .99343  .99361
2.5   .99379  .99396  .99413  .99430  .99446  .99461  .99477  .99492  .99506  .99520
2.6   .99534  .99547  .99560  .99573  .99585  .99598  .99609  .99621  .99632  .99643
2.7   .99653  .99664  .99674  .99683  .99693  .99702  .99711  .99720  .99728  .99736
2.8   .99744  .99752  .99760  .99767  .99774  .99781  .99788  .99795  .99801  .99807
2.9   .99813  .99819  .99825  .99831  .99836  .99841  .99846  .99851  .99856  .99861
3.0   .99865  .99869  .99874  .99878  .99882  .99886  .99889  .99893  .99896  .99900

 z     3.1     3.2     3.3     3.4     3.5     3.6     3.7     3.8     3.9     4.0
 Φ    .99903  .99931  .99952  .99966  .99977  .99984  .99989  .99993  .99995  .99997



Sometimes it is necessary to interpolate linearly for a value of z which lies 
between given values of z in the table, just as we would interpolate when 
using other tables such as sine or cosine tables. 
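
Linear interpolation between two adjacent table entries can be sketched as follows. The two entries used, $\Phi(0.67) = 0.7486$ and $\Phi(0.68) = 0.7517$, are taken from Table 4.1; the function name `interpolate` and the target values 0.674 and 0.75 are illustrative. The same function serves in both directions, either from z to $\Phi(z)$ or from a given $\Phi(z)$ back to z.

```python
def interpolate(x, x0, y0, x1, y1):
    """Linear interpolation: value at x on the line through (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Adjacent entries from Table 4.1.
z_lo, phi_lo = 0.67, 0.7486
z_hi, phi_hi = 0.68, 0.7517

# Forward use: estimate Phi(0.674).
phi_at_0674 = interpolate(0.674, z_lo, phi_lo, z_hi, phi_hi)

# Inverse use: estimate the z for which Phi(z) = 0.75.
z_for_075 = interpolate(0.75, phi_lo, z_lo, phi_hi, z_hi)

print(round(phi_at_0674, 4), round(z_for_075, 3))
```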

You will notice that, in all three types of table mentioned, there is a definite
base from which the integral is evaluated, $-\infty$, $+\infty$, and 0, respectively.
Should we wish to find the probability that Z lies between $z_1$ and $z_2$, we must
take the areas from whichever base is used in the particular table with which
we are working, and add or subtract areas as necessary.

When a problem is being worked, a diagram is extremely helpful, not only
because we may need to obtain areas other than those given directly by the
table, but also so that we can really understand the problem and locate the
area that we need. For example, the shaded area in Fig. 4.7 represents

$$P(z_1 \le Z \le z_2) = \Phi(z_2) - \Phi(z_1).$$

Fig. 4.7



Fig. 4.8

Example 7 Find $P(X \ge 0)$ when X is distributed

(a) N(−3, 36), that is, with mean −3, variance 36,

(b) N(3, 36).

(a) Transforming, we have

$$Z = \frac{X - \mu}{\sigma} = \frac{0 - (-3)}{6} = \frac{1}{2},$$

since $\mu = -3$, $\sigma^2 = 36$. $X \ge 0$ corresponds to $Z \ge \frac{1}{2}$, where Z ~ N(0,1). We
require the shaded area shown in Fig. 4.8, that is,

$$P(Z \ge \tfrac{1}{2}) = 1 - \Phi(\tfrac{1}{2}) = 1 - 0.6915 = 0.308(5).$$

(b) $Z = \frac{0 - 3}{6} = -\frac{1}{2}$ and $X \ge 0$ corresponds to $Z \ge -\frac{1}{2}$, where
Z ~ N(0,1). We require the area under the curve from $z = -\frac{1}{2}$ to $z = +\infty$ in
Fig. 4.8. By symmetry, this is equal to the area from $z = -\infty$ to $z = +\frac{1}{2}$, which
is $\Phi(\frac{1}{2})$

$$\Rightarrow P(X \ge 0) = P(Z \ge -\tfrac{1}{2}) = \Phi(\tfrac{1}{2}) = 0.691(5).$$


Example 8 Given that the IQ scores of young people between 15 and 17 
years of age are distributed N(100,169), find the proportion of this age group 
who have IQs (a) above 110, (b) below 75, (c) above 80, (d) between 75 and 
80, (e) between 110 and 140, (f) between 80 and 110. 



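Fig. 4.9

$\mu = 100$, $\sigma^2 = 169$, $\sigma = 13$.

$$Z = \frac{X - \mu}{\sigma} = \frac{X - 100}{13}, \quad \text{and } Z \sim N(0,1).$$

(a) X > 110 corresponds to $Z > \frac{10}{13}$. (See Fig. 4.9.)

$$P(X > 110) = P\left(Z > \tfrac{10}{13}\right) = 1 - \Phi\left(\tfrac{10}{13}\right) = 1 - 0.7791 \approx 0.221.$$

(b) X < 75 corresponds to $Z < -\frac{25}{13}$.

$$P(X < 75) = P\left(Z < -\tfrac{25}{13}\right) = 1 - \Phi\left(\tfrac{25}{13}\right), \text{ by symmetry,}$$

$$\approx 1 - 0.9728 \approx 0.027.$$

(c) X > 80 corresponds to $Z > -\frac{20}{13}$.

$$P(X > 80) = P\left(Z > -\tfrac{20}{13}\right) = \Phi\left(\tfrac{20}{13}\right) \approx 0.938.$$

(d) 75 < X < 80 corresponds to $-\frac{25}{13} < Z < -\frac{20}{13}$.

$$P(75 < X < 80) = P\left(-\tfrac{25}{13} < Z < -\tfrac{20}{13}\right) = \Phi\left(\tfrac{25}{13}\right) - \Phi\left(\tfrac{20}{13}\right), \text{ by symmetry,}$$

$$= 0.9728 - 0.9380 \approx 0.035.$$

(e) 110 < X < 140 corresponds to $\frac{10}{13} < Z < \frac{40}{13}$.

$$P(110 < X < 140) = P\left(\tfrac{10}{13} < Z < \tfrac{40}{13}\right) = \Phi\left(\tfrac{40}{13}\right) - \Phi\left(\tfrac{10}{13}\right)$$

$$= 0.9990 - 0.7791 \approx 0.22.$$

(f) 80 < X < 110 corresponds to $-\frac{20}{13} < Z < \frac{10}{13}$.

$$P(80 < X < 110) = P\left(-\tfrac{20}{13} < Z < \tfrac{10}{13}\right) = \Phi\left(\tfrac{10}{13}\right) - \Phi\left(-\tfrac{20}{13}\right)$$

$$= \Phi\left(\tfrac{10}{13}\right) - \left[1 - \Phi\left(\tfrac{20}{13}\right)\right] = \Phi\left(\tfrac{10}{13}\right) + \Phi\left(\tfrac{20}{13}\right) - 1$$

$$\approx 0.717.$$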
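
The six proportions for a variable distributed N(100, 169) can be recomputed without tables. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the dictionary keys simply label the parts (a) to (f), and small differences in the last decimal place compared with table-based answers are to be expected.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=13)   # N(100, 169) has standard deviation 13

results = {
    "a": 1 - iq.cdf(110),            # above 110
    "b": iq.cdf(75),                 # below 75
    "c": 1 - iq.cdf(80),             # above 80
    "d": iq.cdf(80) - iq.cdf(75),    # between 75 and 80
    "e": iq.cdf(140) - iq.cdf(110),  # between 110 and 140
    "f": iq.cdf(110) - iq.cdf(80),   # between 80 and 110
}

for part, value in results.items():
    print(part, round(value, 3))
```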


Example 9 The dry weight of a certain bedding plant is a normal variable
with mean 7 g and standard deviation 1.5 g. Find the dry weight which is
exceeded by 25% of these bedding plants. Find also the range of values,
symmetric about the mean, which includes 98% of the dry weights of all the
plants.

$$X \sim N(7, (1.5)^2), \qquad Z = \frac{X - \mu}{\sigma} = \frac{X - 7}{1.5}, \quad \text{where } Z \sim N(0,1).$$

We want to find the value of z such that the shaded area of Fig. 4.10 equals
0.25, and, hence, $\Phi(z) = 0.75$. Using the tables so that we are looking up the
value of z for a given $\Phi(z)$, we find z = 0.674 when $\Phi(z) = 0.75$

$$\Rightarrow \frac{X - 7}{1.5} = 0.674 \Rightarrow X = 8.011.$$

The dry weight which is exceeded by 25% of the plants is 8.01 g.

Fig. 4.10

Fig. 4.11

For the symmetric interval which includes 98% of all the dry weights, we
see in Fig. 4.11 that we need

$$\Phi(z) = 0.5 + 0.49 = 0.99 \Rightarrow z = 2.327$$

$$\Rightarrow \frac{X - 7}{1.5} = \pm 2.327 \Rightarrow X = 7 \pm 3.49 = (3.51, 10.49).$$

The range within which 98% of the dry weights lie is 3.51 to 10.49 g.
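
Reading the table ‘backwards’, from a given $\Phi(z)$ to z, corresponds to evaluating the inverse of the distribution function. A sketch using `statistics.NormalDist.inv_cdf` from the Python standard library is shown below; the variable names are illustrative, and the results agree with the table-based answers to the accuracy quoted.

```python
from statistics import NormalDist

weight = NormalDist(mu=7, sigma=1.5)   # dry weight in grams

# Weight exceeded by 25% of plants: the value with 75% of the area below it.
exceeded_by_quarter = weight.inv_cdf(0.75)

# Symmetric 98% interval: 1% in each tail, so Phi(z) = 0.99 at the upper end.
z99 = NormalDist().inv_cdf(0.99)
lower, upper = 7 - 1.5 * z99, 7 + 1.5 * z99

print(round(exceeded_by_quarter, 2), round(lower, 2), round(upper, 2))
```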

Example 10 Find the probability that a normally distributed random
variable X differs from its mean by less than twice its standard deviation.

Let $X \sim N(\mu, \sigma^2)$, $Z = \frac{X - \mu}{\sigma}$, where Z ~ N(0,1).

$\mu - 2\sigma < X < \mu + 2\sigma$ corresponds to −2 < Z < 2.

$$P(-2 < Z < 2) = \Phi(2) - \Phi(-2)$$

$$= \Phi(2) - 1 + \Phi(2)$$

$$= 2\Phi(2) - 1 = 2 \times 0.97725 - 1 = 0.9545.$$

This result is sometimes called the ‘2-σ Rule’; that is, that approximately 95%
of the distribution lies within a distance of $2\sigma$ either side of the mean $\mu$, where
$\sigma$ is the standard deviation of the distribution. In fact, $1.96\sigma$ either side of $\mu$
gives a better approximation for the 95% interval, but ‘2-σ’ is a useful rule to
use.


The normal distribution used as an approximation to the binomial 
distribution B(n,p) 

For the binomial distribution with $p = \frac{1}{2}$ the sketch of the probability function
is symmetrical and, if n is large, so that there is a large number of points
plotted on the sketch, the graph has the characteristic shape of the normal
distribution, although in B(n,p) the variable is discrete. Obviously, the larger
n becomes, and the nearer p is to $\frac{1}{2}$, the more closely will the sketch of B(n,p)
approach a normal distribution shape.

It can be proved mathematically that, if X ~ B(n,p), then for large n and
p not too close to 0 or 1, the distribution of X may be approximated by
N(np, npq), where q = 1 − p. That is, B(n,p) can be approximated by a
normal distribution with mean np and variance npq. A common rule of thumb
for the conditions on n and p is to use the approximation only when np and
nq both exceed 5.

However, when we make this approximation, we are approximating a
discrete variable by a continuous one, and so we make a continuity correction.
This means that we must take particular care when deciding on the end-points
of the intervals involved. To do this, we consider each discrete point as
representing a corresponding continuous range. For example, the integers 2
and 5 would be considered as the corresponding ranges 1½ to 2½ and 4½ to 5½,
respectively. Then, if we wished to find $P(X \le 2)$, we would use P(X < 2.5),
and P(X = 5) would be P(4.5 < X < 5.5), in the approximate normal
distribution.

Example 11 Calculate P(X = 2) when X ~ B(14, 0.4)

(a) exactly,

(b) using a normal approximation.

(a) $P(X = 2) = \binom{14}{2}(0.6)^{12}(0.4)^2 = 91(0.6)^{12}(0.4)^2 \approx 0.0317.$

(b) Since np = 14(0.4) = 5.6 and npq = 14(0.4)(0.6) = 3.36, then 1.5 < X < 2.5
corresponds to

$$\frac{1.5 - 5.6}{\sqrt{3.36}} < Z < \frac{2.5 - 5.6}{\sqrt{3.36}}$$

$$\Rightarrow -2.2367 < Z < -1.6912.$$

$$P(-2.2367 < Z < -1.6912) = \Phi(2.2367) - \Phi(1.6912)$$

$$= 0.9873 - 0.9546 = 0.0327.$$

$$\Rightarrow P(X = 2) \approx 0.0327.$$

We can see that the error in the approximation is quite small, approximately
3%, even though n is not very large.
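
The comparison between the exact binomial probability and its normal approximation with a continuity correction can be reproduced as below. This is a sketch using `math.comb` and `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p = 14, 0.4
q = 1 - p

# Exact binomial probability P(X = 2).
exact = math.comb(n, 2) * p ** 2 * q ** 12

# Normal approximation N(np, npq); X = 2 is treated as the range 1.5 to 2.5.
approx_dist = NormalDist(mu=n * p, sigma=math.sqrt(n * p * q))
approx = approx_dist.cdf(2.5) - approx_dist.cdf(1.5)

relative_error = (approx - exact) / exact
print(round(exact, 4), round(approx, 4), round(100 * relative_error, 1))
```

The last printed value is the percentage error of the approximation relative to the exact probability.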

Example 12 There is a probability of 0.75 that, when a fly is sprayed with a
certain fly-killer, it will die. Given that 6 flies are sprayed, find the probability
that at least 5 of them will die.

A swarm of 100 flies is sprayed. Estimate the probability that at least 70 of
them will be killed by the spray.

Here we have a binomial distribution B(n, 0.75) for X, the number of flies
killed.

For n = 6, $P(X \ge 5) = P(5) + P(6) = 6(0.25)(0.75)^5 + (0.75)^6$

$= 2.25(0.75)^5$
$= 0.534.$

For n = 100, we will use the normal approximation N(75, 18.75), and make a
continuity correction.

X > 69.5 corresponds to $Z > \frac{69.5 - 75}{\sqrt{18.75}} = -1.2702.$

$P(Z > -1.2702) = \Phi(1.2702)$, by symmetry,

$= 0.898.$

$P(\ge 70 \text{ flies killed}) \approx 0.898.$

Here a great deal of arithmetic would have been required to calculate the
probability from B(100, 0.75).

Example 13 A machine produces articles of which, on average, 20% are
defective. Find an approximate value for the probability that a random
sample of 400 of the articles will contain more than 96 which are defective.

p = 0.2, q = 1 − p = 0.8, n = 400 ⇒ X ~ B(400, 0.2).

We use the approximation N(80, 64), and a continuity correction. X > 96
corresponds to $Z > \frac{96.5 - 80}{8} = 2.0625.$

$P(Z > 2.0625) = 1 - \Phi(2.0625) = 1 - 0.9804 = 0.0196 \approx 0.02.$
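
With a computer, the ‘great deal of arithmetic’ needed for an exact binomial tail is no longer an obstacle, so the quality of the normal approximation can be examined directly. The sketch below is illustrative: `binom_tail` and `normal_tail` are hypothetical helper names, and the two cases are B(100, 0.75) with at least 70 successes and B(400, 0.2) with more than 96 (that is, at least 97) successes.

```python
import math
from statistics import NormalDist


def binom_tail(n, p, k):
    """Exact P(X >= k) for X ~ B(n, p)."""
    return sum(math.comb(n, r) * p ** r * (1 - p) ** (n - r)
               for r in range(k, n + 1))


def normal_tail(n, p, k):
    """Normal approximation to P(X >= k), with a continuity correction."""
    approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
    return 1.0 - approx.cdf(k - 0.5)


for n, p, k in [(100, 0.75, 70), (400, 0.2, 97)]:
    print(n, p, k, round(binom_tail(n, p, k), 4), round(normal_tail(n, p, k), 4))
```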


The normal distribution used as an approximation to the Poisson 
distribution 

The Poisson distribution with mean and variance $\mu$, can be approximated, for
large values of $\mu$, by $N(\mu, \mu)$. The usual rule is to interpret ‘large values’ as
meaning $\mu > 20$, although the approximation is quite good for $\mu > 10$. The
larger $\mu$ becomes, the better is the approximation. We are again
approximating a discrete random variable by a continuous one, and, as we did when
approximating the binomial by a normal distribution, we make a continuity
correction in the same way.

Example 14 The number of deaths in road accidents in Greater London in a 
given month can be assumed to have a Poisson distribution with mean 22. 
Calculate approximate values for the probabilities that there will be next 
month 

(a) less than 20 road deaths, 

(b) 25 or more road deaths. 

Since $\mu = 22$ we will use the approximation N(22, 22) and a continuity
correction.

(a) X < 19.5 corresponds to $Z < \frac{19.5 - 22}{\sqrt{22}} = -0.5330.$

$P(Z < -0.5330) = \Phi(-0.5330) = 1 - \Phi(0.5330) = 0.297$
⇒ $P(< 20 \text{ deaths}) \approx 0.297.$

(b) X > 24.5 corresponds to $Z > \frac{24.5 - 22}{\sqrt{22}} = 0.5330.$

$P(Z > 0.5330) = 1 - \Phi(0.5330) = 0.297$

⇒ $P(25 \text{ or more deaths}) \approx 0.297.$

Example 15 At an emergency centre, ‘999’ calls come in at an average rate
of 3 per hour. Find the probability that there are 5 or more calls in a 2 hour
period. Estimate the probability that there are 30 or more calls in a 12 hour
period.

The conditions given in the question indicate that it is reasonable to assume
that the number of calls in an interval of time h hours is a Poisson variable
with mean $\mu = 3h$.

For h = 2, $\mu = 6$.

$$P(X_2 \ge 5) = 1 - P(0) - P(1) - P(2) - P(3) - P(4)$$

$$= 1 - e^{-6}\left(1 + 6 + \frac{6^2}{2} + \frac{6^3}{6} + \frac{6^4}{24}\right)$$

$$= 1 - 115e^{-6} = 0.715.$$

For h = 12, $\mu = 36$.

We use the approximation N(36, 36), and a continuity correction.

$$P(X_{12} > 29.5) = P\left(Z > \frac{29.5 - 36}{6}\right) = P\left(Z > -\frac{6.5}{6}\right)$$

$$= P(Z > -1.0833)$$

$$= \Phi(1.0833), \text{ by symmetry,}$$

$$= 0.8607$$

⇒ $P(30 \text{ or more calls in 12 hours}) \approx 0.861.$
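
The normal approximation to a Poisson tail can be compared with the exact value in the same way as for the binomial. The following sketch uses illustrative helper names, `poisson_tail` and `normal_tail`, and evaluates the three cases worked above: mean 6 with 5 or more events, mean 22 with 25 or more, and mean 36 with 30 or more.

```python
import math
from statistics import NormalDist


def poisson_tail(mean, k):
    """Exact P(X >= k) for a Poisson variable with the given mean."""
    below = sum(math.exp(-mean) * mean ** r / math.factorial(r) for r in range(k))
    return 1.0 - below


def normal_tail(mean, k):
    """Normal approximation N(mean, mean) to P(X >= k), with continuity correction."""
    approx = NormalDist(mu=mean, sigma=math.sqrt(mean))
    return 1.0 - approx.cdf(k - 0.5)


for mean, k in [(6.0, 5), (22.0, 25), (36.0, 30)]:
    print(mean, k, round(poisson_tail(mean, k), 4), round(normal_tail(mean, k), 4))
```

For the smallest mean the two columns differ noticeably, which is why the exact Poisson terms are used there and the approximation is kept for the larger means.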


Exercise 4.3 

1 Given that X has distribution N(2, 0.16), evaluate

(a) ?(X ^ 2-3), 

(b) $P(1.8 \le X \le 2.1)$.

2 The diameter of a certain type of copper tubing is normally distributed with mean
8 mm and variance 0.04 mm². Find the probability that a diameter will exceed
8.1 mm. A piece of this tubing is rejected if the diameter differs from the mean
diameter by more than 0.25 mm. Find the probability that a piece of the tubing
will be rejected.

3 A large cargo of tomatoes has, on average, 1 bad tomato in 10. Find, to two 
significant figures, the probability that a random sample of 100 will contain 15 or 
more bad tomatoes. 

4 A machine produces articles of which, on average, 5% are defective. Use a sui- 
table normal approximation to calculate, to three decimal places, the probability 
that, in a random sample of 800 articles, more than 45 will prove to be defective. 

5 A telephone exchange receives calls at random at an average rate of 120 calls 
every hour. Use the normal approximation to the Poisson distribution to calculate 
the probability that fewer than 120 calls, but not fewer than 90 calls, are received 
in a 1 hour period. 

6 Bungs are manufactured which are to fit holes in barrels. The diameter X of the
bungs is normally distributed with mean 48 mm and standard deviation 0.3 mm.
The diameter Y of the holes is normally distributed with mean 49 mm and
standard deviation 0.5 mm.

(a) Find the proportion of bungs with diameter greater than 48.5 mm.

(b) Find the proportion of holes with diameter less than 48.5 mm.

(c) If the bungs and holes are selected at random, find the proportion of bungs
that will be too big for the holes.

Miscellaneous Exercise 4

1 The probability that a fluorescent light-tube lasts longer than t hours is $e^{-t/k}$. Find
the probability density function for the lifetime of a tube and state the mean
lifetime.

Given that the mean lifetime is 2000 hours, find the probability that a tube will
last more than 3500 hours. If the manufacturer wishes to ensure that fewer than 1
in 1500 of his tubes fail before 10 hours of life, find the smallest mean lifetime
which he can allow his tubes to have.

2 A certain continuous value is recorded, to the nearest whole number, as 5. Given
that the exact value has a uniform distribution, find the probability that the exact
value is (a) between 4.8 and 5.1, (b) greater than 4.9, (c) less than 4.7.

3 The mass printed on a packet of sultanas by the manufacturer is 500 g. In fact, it is
discovered that the packets coming from the factory have a mean mass of 505 g
and a standard deviation of 2.5 g. Assuming that the masses are normally
distributed, estimate the percentage of packets weighing between 500.5 g and 510.5 g.
If the manufacturer decides to alter the mean mass so that 10% of the output is
less than the intended mass of 500 g, find the new mean, assuming that the
standard deviation remains unaltered.

4 A tyre manufacturer guarantees to replace his tyres free if they fail within 1 year 
of purchase and to replace them at half-price if they fail in more than 1 year but 
less than 2 years. Replacement tyres are not replaced if they fail. From his 
experience of production over the years, the manufacturer knows that the time to
failure has had a normal distribution with mean of 3.5 years and a standard
deviation of 0.9 years. Calculate the probability that

(a) a tyre will fail in under 1 year, 

(b) a tyre will fail in more than 1 year but under 2 years. 

The manufacturer sells a new tyre for £35, of which £25 is the cost of production. 
Calculate his expected profit from the sale of 1000 tyres. 

5 In a large cafe 1 in 3 of the customers buys a cup of tea. 

(a) Find the probability that at least 4 out of the first 9 customers will buy a cup of 
tea. 

(b) Given that the probability that 1000 customers will buy fewer than k cups of
tea is 0.98, find k.

Given that overall 2 customers per 1000 make a complaint and assuming that 
complaints occur independently, find the probability of receiving fewer than 2 
complaints from 200 customers. 

6 The probability density function f(t) of the time T to failure of an item is given by

$$f(t) = \frac{1}{k}\, e^{-t/k} \quad (0 < t < \infty).$$

Find the mean time to failure and the variance.

Two components in an engine have failure time distributions corresponding to
means k and 3k, respectively. The engine will stop if either component fails, and
the failures of the two components are independent. Show that the chance of the
engine continuing to work for a time k from the start is approximately 0.26.

7 Electric bulbs have an average life of 3000 hours and 98% of the bulbs have a life 
of at least 2500 hours. Estimate, to two decimal places, the standard deviation, 
stating any assumption you are making about the distribution. 

Find the percentage of the bulbs which would be expected 





(a) to last more than 3300 hours, 

(b) to fail in less than 2400 hours. 

8 The natural logarithm of the emerald content in carats of a cubic metre of a 
certain limestone is a normal variable with mean T69 and variance 0-212. Find the 
probability that a cubic metre of the limestone will contain 

(a) less than 1-41 carats, 

(b) between 1-41 and 2-84 carats. 

9 The time T seconds between the arrival of successive vehicles at a point on a
road has pdf f(t) given by $f(t) = \lambda e^{-t/a}$, for $t \ge 0$, where $\lambda$ and a are positive
constants. Find $\lambda$ in terms of a and sketch the graph of the probability density
function.

Given that a = 50, state the mean and variance. 

A pedestrian takes 30 seconds to cross the road at this point. With a = 50, 
calculate the probability that, if she sets off as one vehicle passes, she will com- 
plete the crossing before the next vehicle arrives. Calculate also the probability 
that, if she adopts the same procedure on the return journey, she completes each 
crossing without a vehicle arriving while she is in the process of crossing. 

10 An athlete finds that, in throwing the discus, his distances form a normal
distribution with mean 55 m and standard deviation 4.3 m. Calculate the probability
that he will throw the discus more than 61 m on a given occasion.

Find the probability that 3 independent throws will all be less than 61 m. 

Find the distance that he can expect to exceed once in 100 throws. 

11 A man leaves home at 07.30 every morning in order to arrive at work at 09.00. 
Over a long period, he finds that he is late once in 20 times. He then tries leaving 
home at 07.20 and finds that over a similar period he is late once in 50 times. 
Given that the time of his journey has a normal distribution, find the latest time at 
which he should leave home in order not to be late more than once in 100 times. 
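
One way to organise the working for Question 11 is to turn each observed lateness rate into an equation in the unknown mean and standard deviation of the journey time. The Python sketch below does this with exact normal quantiles; the variable names are illustrative, and rounding the departure time down to a whole minute is an assumption made to stay on the safe side.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal N(0, 1)

# Leaving at 07.30 allows 90 minutes; late once in 20 means P(T > 90) = 0.05.
# Leaving at 07.20 allows 100 minutes; late once in 50 means P(T > 100) = 0.02.
z_90 = std.inv_cdf(0.95)
z_100 = std.inv_cdf(0.98)

sigma = (100 - 90) / (z_100 - z_90)   # subtract the two equations mu + z*sigma = time
mu = 90 - z_90 * sigma

needed = mu + std.inv_cdf(0.99) * sigma   # minutes required to be late once in 100
leave = 9 * 60 - needed                   # departure time in minutes after midnight
hours, minutes = divmod(int(leave), 60)   # round down to the whole minute

print(f"{hours:02d}.{minutes:02d}")
```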

12 Given that the probability of a male birth is 0.517, find, to three decimal places, 
the probability that there will be fewer boys than girls in 1000 births. Find, to the 
nearest hundred, the size of the smallest sample which should be taken so that 
the probability of fewer boys than girls is less than 0.03. (Assume here that the 
sample size is large enough to make a continuity correction unnecessary.) 
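
A possible numerical treatment of Question 12 uses the normal approximation to the binomial distribution. In the sketch below the continuity correction is applied in the first part and omitted in the second, as the question directs; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

p = 0.517   # probability of a male birth, from the question
n = 1000

# Number of boys B is approximately N(np, np(1 - p)).
boys = NormalDist(n * p, math.sqrt(n * p * (1 - p)))
# Fewer boys than girls means B <= 499; with continuity correction use 499.5.
p_fewer = boys.cdf(499.5)

# Without continuity correction, P(B < n/2) < 0.03 requires
# (p - 0.5) * sqrt(n) / sqrt(p(1 - p)) to exceed the upper 3% point of N(0, 1).
z = NormalDist().inv_cdf(0.97)
n_min = (z * math.sqrt(p * (1 - p)) / (p - 0.5)) ** 2
n_hundreds = math.ceil(n_min / 100) * 100   # next hundred at or above n_min

print(round(p_fewer, 3), n_hundreds)
```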

13 Axles are made of nominal length 1 m but in fact they form a normal distribution 
with mean 1.01 m and standard deviation 0.01 m. Each axle costs £5 to make and 
may be used immediately if its length lies between 0.99 m and 1.02 m. If its length 
is less than 0.99 m, the axle is useless but has a scrap value of £1. If its length 
exceeds 1.02 m, it may be shortened and used at an extra cost of £1.50. Find the 
cost per usable axle. 

14 In the assembly of an engine, a cylindrical rod with circular cross-section of 
diameter a has to fit into a circular collar of diameter A. Measurement of a large 
number of these rods and collars indicates that both a and A are normally 
distributed about respective means 10.02 cm and 10.17 cm with respective standard 
deviations 0.04 cm and 0.06 cm. If components are selected at random for 
assembly, find the percentage of the rods which are likely to be too big for the 
collars for which they are chosen. 
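
Question 14 turns on the distribution of the difference of two independent normal variables. A minimal Python sketch, assuming the rod and collar diameters are independent, is given below; the name `clearance` is introduced here for illustration only.

```python
import math
from statistics import NormalDist

rod = NormalDist(10.02, 0.04)      # rod diameter a, in cm
collar = NormalDist(10.17, 0.06)   # collar diameter A, in cm

# For independent normals, A - a is normal with mean equal to the difference
# of the means and variance equal to the sum of the variances.
clearance = NormalDist(collar.mean - rod.mean,
                       math.hypot(rod.stdev, collar.stdev))

# The rod is too big for its collar when the clearance is negative.
pct_too_big = 100 * clearance.cdf(0.0)

print(round(pct_too_big, 2))
```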

15 The heights of boys aged 13-14 years are normally distributed with mean 162 cm 
and standard deviation 6 cm. The heights of girls in the same age range are 
normally distributed with mean 158 cm and standard deviation 5 cm. Determine, 
to three decimal places, the probabilities of differences in height greater than 4 cm 
between 
(a) 2 boys in the group, 

(b) 2 girls in the group, 

(c) a boy and a girl in the group. 

16 The lifetime, T hours, of a transistor is assumed to follow the exponential 
distribution with probability density function $f(t) = \lambda e^{-\lambda t}$ ($\lambda > 0$, $t > 0$). Find 
the mean lifetime of a transistor and show that the probability of its surviving 
this length of time or longer is approximately 0.368. 

Given that the mean life is 2500 hours, find the number of hours for which the 
manufacturer should guarantee his transistors if he wants 98% of his output to 
satisfy his guarantee. 

17 Show that the variance of the uniform distribution of a variable which lies at 
random between $-k$ and $+k$ is $k^2/3$. 

The weights of N passengers in an aircraft are to be totalled but before they are 
added, each weight is rounded off to the nearest 1 kg thus introducing an error 
which has a uniform distribution. Find the mean and variance of the total error of 
the sum. 

Assuming that, when N is large, the total error has a normal distribution, find 
the greatest value of N for which the probability that the total rounding off error 
lies outside the limits ±10 kg is less than 0.01. 
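
The last part of Question 17 can be explored numerically. The sketch below assumes each rounding error is uniform on (-0.5, 0.5) kg, that the errors are independent, and that the total may be treated as normal throughout; the function name `p_outside` is introduced here for illustration. The approximation is poor for small N, so the early steps of the loop serve only to reach the region of interest.

```python
import math
from statistics import NormalDist

HALF_WIDTH = 0.5                 # rounding to the nearest kg: error uniform on (-0.5, 0.5)
VAR_ONE = HALF_WIDTH ** 2 / 3    # variance k^2/3 of a single error, here 1/12

def p_outside(n, limit=10.0):
    """Normal approximation to P(|total error| > limit) for n passengers."""
    total = NormalDist(0.0, math.sqrt(n * VAR_ONE))
    return 2 * total.cdf(-limit)

# The probability increases with n, so step up until the next value would fail.
n = 1
while p_outside(n + 1) < 0.01:
    n += 1

print(n)
```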

18 The weights of the males of a type of goldfish are normally distributed with mean 
30 g and variance 9 g². The weights of the females are distributed N(25, 4). Sketch 
the distribution of 

(a) the weights of a population of half male and half female goldfish, 

(b) the total weight of a pair of breeding goldfish. 

Find the probability that a fish drawn at random from population (a) will have 
a weight less than 25 g, and also the probability that the total weight of a pair 
drawn from population (b) will be less than 50 g. 

State any assumption which you have made. 

19 The probability density function f(t) of the length of life, T hours, of a make of 
television tube is given by 

$f(t) = ke^{-kt}$, $t \ge 0$. 

Find the probability that a tube will last for $T_1$ hours more, given that it has 
already lasted for $T_0$ hours without failing. 

A certain shop has 3 similar television sets containing this make of tube working 
on display in the window. Find, in terms of $T_0$ and $k$, the probability that, owing 
to failure of the tube, exactly 1 set will fail in the first $T_0$ hours, another will fail in 
the next $2T_0$ hours, and the third will last for more than $3T_0$ hours. 
Exercise 1.1 

1 1’J 

2 ( a ) 7> (b) Jy ( C ) T, (d) 7; 

(e)7 

3 (a)i (b)f, (c)£ 

4 (a)#, (b)# 


Exercise 1.2 

1 7 

2 (a)£, (b)f 

3 (a) j, (b)f, (c)f, (d)ii 

4 (a) f, (b) 10, (c) f 

Exercise 1.3 

3 (a) 0.025, (b) 0.4 

Exercise 1.4 

1 (a)f, (b)^ 

Miscellaneous Exercise 1 

^ ( a ) 100 ’ (b) UX) , (c) 1800 

3 0.823 

4 3/8; 1/4; 3/8 

5 (a)£, (b)-jL (c) 

29/44 

7 0.018 

9 (a) 0.105, (b) 0.174, (c) 0.278, (d) size 2 

10 0.09 

11 0.52 

12 5/18; 1/3; 7/18 

13 1/5; 1/5; 1/5 

14 (a) (b)^ 

15 13/136 

16 (a) j, (b)| 

17 (b) 5, (c) 3 

18 00 Jr . ( b ) #- OH; Jr 

Exercise 2.2 

1 (a) 3.83; 2.81, (b) 0.3; 9.21 

3 £5200 

4 7p 

5 1.875 

Exercise 2.3 

A 4’ 32 

2 _7_- iL- 377 

^ 24 ’ 30 ’ 720 

3 $F(x) = 0$ for $x \le 0$, 
$= x^2/4$ for $0 < x \le 1$, 
$= (2x - 1)/4$ for $1 < x \le 2$, 
$= -(x^2 - 6x + 5)/4$ for $2 < x < 3$, 
$= 1$ for $x \ge 3$. 

4 (a) (b) -^forx>2, 

JC 

0 elsewhere. 

5 (a)£, (b)^, (c)^ 

Miscellaneous Exercise 2 

2 n , „ . ( k \ 

2 jvTi i 1 — 2k + I r I 
(2 N 1 - 1) \2" -1 / 

3 5.07; 3.62; (a) 5.15; 14.50, (b) 5.07; 148.59 

4 3.92, 5.58; 16.58, 139.41 

5 2;f 

6 (a) 12, (c)f;ir, (d)i 
7 


8 

9 

10 

11 

12 

13 

14 

15 

16 


(a){, (b) 


(1 - cos 2) 



6. 4. _4_ 

7’ 7’ 21 

(*+ 1) 

(* + 2) 

_L. 5 
24 > 8 

_3_ 4-i9 
16 ’ ’ 80 

3 4 4 

4 5 5 5 25 

tV; 27 ; 0.792; 2.59 



0.54 hundred litres 

2f_ 

3 


Exercise 3.2 
2 

2 Loss of 

3 4; 8 

Exercise 3.3 

1 (a) 0.24, (b) 0.63 

2 0.75 

3 0.384; yes 

4 0.337 

5 (a) 0.0083, (b) 0.1935 


Exercise 3.4 

3 (a)£, (b)^;2- 17 

4 0.019 

Exercise 3.5 

1 0.143 

2 0.097, 0.226, 0.264, 0.205, 0.120, 0.056; 2 

3 0.849 

4 0.463; 0.558 

5 (a) 0.018, (b) 0.433; P ≈ 0.08, yes 

6 (a) 0.040, (b) 0.045 

7 17 

Miscellaneous Exercise 3 

1 0.579 

2 0.212 

3 (a) 0.056, (b) 0.19; 0.53 

4 (a) 0.262, (b) 0.371 

5 (b) $613/5^9$ 

6 0.180; £3; 20; 29 

7 1, 0.95; 0.494; 27.547 ≈ 28 

9 0.9197; 0.00004 

10 (a) 0.444, (b) 0.392, (c) 0.165 

11 P(X = r) = — for r ≥ 1; P(Y = r) = | for r = 1, 2, 3 

12 (a) 0.995, (b) 0.857 

13 0.998 

14 (a) 0.665, (b) 0.737, (c) 0.015; $5^{r-1}/6^{r}$; 6; 6; 30 

15 (a) 0.8607, (b) 0.0005; 0.8614 

Exercise 4.1 

1 j; $F(x) = 0$ for $x < 0$, 
$= x/8$ for $0 \le x \le 8$, 
$= 1$ for $x > 8$ 

2 \\ 3; $F(x) = 0$ for $x < 2$, 
$= (x - 2)/2$ for $2 \le x \le 4$, 
$= 1$ for $x > 4$ 

3 j*, $F(x) = 0$ for $x < 0$, 
$= x/4$ for $0 \le x \le 4$, 
$= 1$ for $x > 4$; f 


Exercise 4.2 

1 0.793 

2 0.368 

3 $2e^{-2x}$ for $x \ge 0$; $e^{-1} \approx 0.368$ 

4 0.264 

5 $1 - e^{-1}$; $e^{-1} - e^{-2}$; $e^{-2} - e^{-3}$ 

Exercise 4.3 

1 (a) 0.2266, (b) 0.2902 

2 0.3085; 0.2112 

3 0.067 

4 0.186 

5 0.479 

6 (a) 4.78%, (b) 15.87%, (c) 4.32%, 

Miscellaneous Exercise 4 

1 $e^{-t/k}/k$; k hours; 0.174; 14 995 hours 

2 (a) 0.3, (b) 0.6, (c) 0.2 

3 95%; 503.2 

4 (a) 0.0027, (b) 0.0451; £9594.60 

5 (a) 0.350, (b) 364; 0.938 

6 k\ k 2 

7 243.42; (a) 10.9%, (b) 0.68% 

8 (a) 0.272, (b) 0.722 

9 $1/a$; 50, 2500; 0.549; 0.301 

10 0.081; 0.775; ≈ 65 m 

11 07.13 hours 

12 0.134; 3100 

13 £5.34 

14 1.88% 

15 (a) 0.637, (b) 0.572, (c) 0.653 

16 $1/\lambda$; 50.5 hours 

17 0, $N/12$; 180 

18 0.274; 0.083; independence of weights of males and females 

19 $e^{-kT_1}$; $6e^{-4kT_0}(1 - e^{-kT_0})(1 - e^{-2kT_0})$ 